Divine intervention? - Lost connection between master and member & bug in DNS database

Techie
Posts: 10

Hello,

 

While promoting the Master Candidate to Master and then rolling back, I ran into the following problem:

 

1/ The promotion of the Master Candidate and the rollback went through without any problem.

2/ The members could not connect to the master:
      a/ In the database, a static CNAME record had been created (www -> example.org.)
      b/ A DTC-type record had been created for the same name (www IN A 127.0.0.1)
Both records were accepted by the Infoblox WebUI.
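To illustrate the kind of conflict described in point 2, here is a minimal Python sketch that flags any name carrying a CNAME together with another record type. It is only an illustration of the general DNS rule that a CNAME should not share its owner name with other data; it is not how NIOS performs its own check, and the function name `find_cname_conflicts` and the record tuples are hypothetical. It also ignores the DNSSEC record types that are allowed next to a CNAME.

```python
from collections import defaultdict


def find_cname_conflicts(records):
    """Return the names that hold a CNAME plus at least one other record type.

    `records` is an iterable of (fqdn, rtype) tuples. This is a hypothetical
    input shape chosen for the example, not an Infoblox export format.
    """
    types_by_name = defaultdict(set)
    for fqdn, rtype in records:
        # Normalise case and the trailing dot so "WWW.example.org." and
        # "www.example.org" are treated as the same owner name.
        types_by_name[fqdn.lower().rstrip(".")].add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


# The two records from point 2, plus one unrelated record.
records = [
    ("www.example.org.", "CNAME"),
    ("www.example.org.", "A"),
    ("mail.example.org.", "A"),
]
print(find_cname_conflicts(records))
```

Run against an export of the zone, a check like this would point at `www.example.org` as the name to clean up before the members try to synchronise again.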

 

An extract of the logs from the operation:

[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) ssindex.c:3354 onedb_import_fqdn_check_conflict(): Uniqueness Violation:Fqdn conflict with existing object of type = .com.infoblox.dns.bind_a zone = ._default.org.example
[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) ssindex.c:3730 onedb_import_ssindex_conflict_check_for_zone(): Object with fqdn conflict of type .com.infoblox.dns.bind_cname and __key = ._default.org.example.www in zone=._default.org.example rrname = www fqdn = ._default.org.example.www
[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) ssindex.c:3733 onedb_import_ssindex_conflict_check_for_zone(): general failure
[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) ssindex.c:3904 onedb_import_ssindex_conflict_check(): general failure
[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) import_utils.c:1126 onedb_create_indexes_post_import(): general failure
[2019/09/19 07:01:27.786] (16963 /infoblox/one/bin/db_import_3x) main.c:572 main(): general failure
[2019/09/19 07:01:27.786] (16963 /infoblox/one/bin/db_import_3x) main.c:670 main(): avoiding import panic bug - ignore handle leak message
[ TIME NOT KNOWN ] (16963) db_local.c:__bdb_handle_leak_messg{}: ib_err_log: process exiting with bdb-handles open 4
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) snapshot.c:516 import_xml_dump(): db_import failed, exit status = 1
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) snapshot.c:517 import_xml_dump(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) snapshot.c:912 jnld_replica_snapshot_receive(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) snapshot.c:997 jnld_replica_snapshot_process(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) negotiate.c:704 jnld_replica_sync_with_master(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) main.c:494 __call_replica_dbsync(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) main.c:4337 db_jnld_handle_connection(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) main.c:4548 db_jnld_listen_loop(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) main.c:4705 main(): general failure
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) main.c:4837 main(): db_jnld exiting on its own
[2019/09/19 07:01:27.891] (15821 db_jnld_recv - 169.254.0.1) main.c:4840 main(): PREMATURE DEATH
[2019/09/19 07:01:27.916] (14760 /infoblox/one/bin/clusterd) procmgr.c:655 cd_procmgr_check_process_death(): Journal Receive Queue Died! (PID 15821, exit status 1)
[2019/09/19 07:01:27.917] (14760 /infoblox/one/bin/clusterd) replica.c:3046 cd_replica_fatal_master_exception(): Lost connection to grid master.
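Most of the lines above only repeat "general failure" as the error travels up the call stack. As a rough sketch, the following Python snippet keeps only the entries that carry an actual message, which makes the root cause (the uniqueness violation) and the final symptom (lost connection to the grid master) stand out. The regular expression is an assumption based solely on the line layout seen in this extract, and the helper name `informative_entries` is hypothetical.

```python
import re

# Layout assumed from the extract above:
# [timestamp] (pid process) file.c:line function(): message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \((?P<pid>\d+) (?P<proc>[^)]*)\) "
    r"(?P<src>\S+):(?P<line>\d+) (?P<func>\w+)\(\): (?P<msg>.*)$"
)


def informative_entries(lines):
    """Parse log lines and drop the ones whose message is just 'general failure'."""
    entries = []
    for raw in lines:
        match = LOG_RE.match(raw.strip())
        # Lines that do not follow the assumed layout are skipped silently.
        if match and match.group("msg") != "general failure":
            entries.append(match.groupdict())
    return entries


# Three lines copied from the extract above.
SAMPLE = [
    "[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) "
    "ssindex.c:3354 onedb_import_fqdn_check_conflict(): Uniqueness Violation:"
    "Fqdn conflict with existing object of type = .com.infoblox.dns.bind_a "
    "zone = ._default.org.example",
    "[2019/09/19 07:01:27.785] (16963 /infoblox/one/bin/db_import_3x) "
    "ssindex.c:3733 onedb_import_ssindex_conflict_check_for_zone(): general failure",
    "[2019/09/19 07:01:27.917] (14760 /infoblox/one/bin/clusterd) "
    "replica.c:3046 cd_replica_fatal_master_exception(): Lost connection to grid master.",
]

for entry in informative_entries(SAMPLE):
    print(entry["ts"], entry["func"], "->", entry["msg"])
```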

 

3/ To correct this problem, I had planned to force a database update on each member outside business hours:

    a/ reset database
    b/ reboot
    c/ set membership

 

To my surprise, when I checked before resetting the database, I saw that the members were connected to the master again.
48 hours had passed between the promotion of the new master and the operation described in point 3.

 

After analyzing the logs, I found nothing that explains this.

 

Can someone provide me with documentation describing what may have happened?

 

Thank you for your help.

Regards

Eric DUVAL
