Failure to check file contents causes node to crash during start up [JIRA: RIAK-2162] #708

@jenafermiller

Description

In lines 1364-1375 of gen_leader.erl, line 1366 verifies that the file needed to start the replication leader exists, and line 1371 reads it. However, if the file is empty, corrupted, or otherwise contains bytes that are not valid external term format, line 1372 throws an exception. This exception kills the gen_leader start-up process, which in turn causes the riak_repl process to fail to start, and eventually the entire node is taken down during start-up.

To fix this issue, line 1372 could be replaced with the following expression, which ignores any decoding error or non-integer value and falls back to 1. Riak will still be prevented from starting when the disk is full, when the data directory is read-only, and so on, because this change does not ignore file read errors or file write errors.

Incarn = case catch binary_to_term(Bin) of
            I when is_integer(I) -> I;
            _ -> 1
         end,
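
For anyone who wants to try the fallback behaviour outside of gen_leader, here is a minimal standalone sketch of the same idea. The module name `incarn_sketch` and the function `decode_incarnation/1` are hypothetical and are not part of gen_leader.erl. The sketch uses `try ... of ... catch` instead of `catch`, so that only the `badarg` raised by `binary_to_term/1` is swallowed; that is a stylistic assumption on my part, not a change to the proposed fix.

```erlang
-module(incarn_sketch).
-export([decode_incarnation/1]).

%% Hypothetical helper for illustration only; not part of gen_leader.erl.
%% Takes the bytes read from the incarnation file and returns the stored
%% integer, or 1 when the bytes are empty, corrupted, or decode to a
%% term that is not an integer.
decode_incarnation(Bin) when is_binary(Bin) ->
    try binary_to_term(Bin) of
        I when is_integer(I) -> I;   % valid incarnation value
        _Other -> 1                  % decodable, but not an integer
    catch
        error:badarg -> 1            % empty or invalid external term format
    end.
```

With this sketch, `incarn_sketch:decode_incarnation(term_to_binary(7))` returns 7, while `incarn_sketch:decode_incarnation(<<>>)` and `incarn_sketch:decode_incarnation(term_to_binary(foo))` both return 1. Errors from reading or writing the file are outside this function and would still surface to the caller.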
