Heads Up! Log Insight Fails to Start with: Cannot Connect

I have heard of a few people who have restarted the Log Insight 2.5 virtual appliance and found that the Log Insight service failed to start with a Cassandra error stating it cannot connect. In this post, I would like to discuss how you can address the issue.

Symptoms

If you restart the Log Insight service or the Log Insight virtual appliance, Log Insight may fail to start with an error similar to:

# service loginsight start
Log Insight did not shutdown cleanly. Cleaning up.
Starting Log Insight...
Error message unavailable.
StartupException(description:com.vmware.loginsight.daemon.LogInsightDaemon$StartupFailedException: Daemon startup failed: Failed to start Cassandra Server: StartupException(description:Unable to connect to Cassandra node at localhost:9042: com.vmware.loginsight.cassandra.CassandraException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1 (com.datastax.driver.core.TransportException: [localhost/127.0.0.1] Cannot connect))).)
    at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$waitUntilStarted_result.read(DaemonCommands.java:5818)
    at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$Client.recv_waitUntilStarted(DaemonCommands.java:1832)
    at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$Client.waitUntilStarted(DaemonCommands.java:1808)
    at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$PooledClient.waitUntilStarted(DaemonCommands.java:1476)
    at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$ClientStub.waitUntilStarted(DaemonCommands.java:375)
    at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$ClientStub.waitUntilStarted(DaemonCommands.java:389)
    at com.vmware.loginsight.daemon.DaemonClient.waitUntilStarted(DaemonClient.java:56)
    at com.vmware.loginsight.admintool.LICommandlineAdminTool.startDaemon(LICommandlineAdminTool.java:1287)
    at com.vmware.loginsight.admintool.LICommandlineAdminTool.access$300(LICommandlineAdminTool.java:85)
    at com.vmware.loginsight.admintool.LICommandlineAdminTool$StartTask.process(LICommandlineAdminTool.java:562)
    at com.vmware.loginsight.admintool.LICommandlineAdminTool$StartTask.process(LICommandlineAdminTool.java:539)
    at com.vmware.loginsight.commons.actions.commandline.AbstractCommandLineAction.handleOwnArgs(AbstractCommandLineAction.java:149)
    at com.vmware.loginsight.commons.actions.commandline.AbstractCommandLine.handle(AbstractCommandLine.java:42)
    at com.vmware.loginsight.commons.actions.commandline.AbstractCommandLineCategory.handleOwnArgs(AbstractCommandLineCategory.java:97)
    at com.vmware.loginsight.commons.actions.commandline.AbstractCommandLine.handle(AbstractCommandLine.java:42)
    at com.vmware.loginsight.admintool.LICommandlineAdminTool.run(LICommandlineAdminTool.java:467)
    at com.vmware.loginsight.admintool.LICommandlineAdminTool.main(LICommandlineAdminTool.java:335)

Note: This issue could impact standalone instances and/or one or more nodes in a cluster, but ONLY applies to Log Insight version 2.5.
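
If you want to confirm that you are hitting this particular failure rather than some other startup problem, one option is to check the startup output for the Cassandra connection signature shown above. Here is a minimal sketch; the function name is_cassandra_connect_failure is hypothetical (it is not part of Log Insight), and the two patterns are simply strings taken from the error text in this post:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Log Insight.
# Reads text on stdin and succeeds (exit 0) if it contains the
# Cassandra connection failure described in this post.
is_cassandra_connect_failure() {
    grep -q -e 'Unable to connect to Cassandra node' \
            -e 'NoHostAvailableException'
}

# Example usage (run on the appliance):
#   service loginsight start 2>&1 | is_cassandra_connect_failure \
#       && echo "Looks like the Cassandra 'Cannot connect' issue"
```

If the check does not match, you are probably looking at a different problem and should read the full startup output before trying anything else.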

Cause

Log Insight fails to start because the Cassandra database takes too long to start. There are several reasons why the database could take a long time to start, including:

  • Log Insight is improperly sized and/or resource constrained. Please refer to the Getting Started guide for sizing requirements.
  • The database requires repair. While repair operations run automatically, they may fail for a variety of reasons.
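
Since the failure comes down to Cassandra not accepting connections on localhost:9042 in time, one way to see how long it really takes is to probe the port with a bounded retry. The following is only a sketch under a few assumptions: port_open and wait_for_port are hypothetical names, the attempt count and delay are values you pick yourself, and it relies on bash's /dev/tcp redirection being available:

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Log Insight or Cassandra.

# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened.
# The subshell ensures the file descriptor is closed again right away.
port_open() {
    local host=$1 port=$2
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# wait_for_port HOST PORT ATTEMPTS DELAY
# Probes the port up to ATTEMPTS times, sleeping DELAY seconds between
# probes. Returns 0 as soon as the port accepts a connection, 1 otherwise.
wait_for_port() {
    local host=$1 port=$2 attempts=$3 delay=$4 i
    for (( i = 1; i <= attempts; i++ )); do
        port_open "$host" "$port" && return 0
        (( i < attempts )) && sleep "$delay"
    done
    return 1
}

# Example usage: give Cassandra up to ten minutes (20 probes, 30s apart).
#   wait_for_port localhost 9042 20 30 && echo "Cassandra is listening"
```

If the port eventually opens on its own, sizing or resource constraints are the more likely cause; if it never opens, a repair is worth trying.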

Resolution

VMware has released a KB article with the steps required to manually perform a repair operation on the database, which should allow Log Insight to start. Given that the process requires multiple manual steps, I have written a quick script. To use the script, simply run the command and specify the repair flag:

# sh li-cassandra.sh
USAGE: li-cassandra.sh [--status | --repair]
# sh li-cassandra.sh --repair
USAGE: li-cassandra.sh --repair --force
WARNING: This command will stop the Log Insight service.
# sh li-cassandra.sh --repair --force
Stopping Log Insight...done
Starting Cassandra.....done
Waiting for node(s)....done
Repairing Cassandra....done
Stopping Cassandra.....done
Starting Log Insight...done

That’s it! Of course, if you are experiencing a different issue, the last step of starting Log Insight will likely fail. Note that there is no harm in running a repair operation, so when in doubt, try this script. With that, here is the script:

#!/usr/bin/env bash
# The glob in CASSANDRA_BIN is expanded where the variable is used,
# so $CASSANDRA_BIN must stay unquoted below.
CASSANDRA_BIN=/usr/lib/loginsight/application/lib/apache-cassandra-*/bin
CASSANDRA_CONF=/storage/core/loginsight/cidata/cassandra/config

# Print the Cassandra node status for logdb.
status() {
    $CASSANDRA_BIN/nodetool status -k logdb
    exit
}

# Stop Log Insight, start Cassandra on its own, repair it, then start Log Insight.
repair() {
    # Require --force because this stops the Log Insight service.
    if [ "$1" != "--force" ]; then
        echo "USAGE: $0 --repair --force"
        echo "WARNING: This command will stop the Log Insight service."
        exit 1
    fi

    echo -n "Stopping Log Insight..."
    OUTPUT=$(service loginsight stop)
    RC=$?
    # Exit codes 0 and 52 are both treated as a successful stop.
    if [ "$RC" -ne 0 ] && [ "$RC" -ne 52 ]; then
        echo "FAILED"
        echo
        echo "$OUTPUT"
        exit 1
    else
        echo "done"
    fi

    echo -n "Starting Cassandra....."
    # Stop any Cassandra instance that is still running, then start it again.
    OUTPUT=$($CASSANDRA_BIN/nodetool stopdaemon)
    OUTPUT+=$($CASSANDRA_BIN/cassandra)
    if [ "$?" -ne 0 ]; then
        echo "FAILED"
        echo
        echo "$OUTPUT"
        echo
        echo "Unable to start Cassandra, check the above output and cassandra.log for clues."
        exit 1
    else
        echo "done"
    fi

    echo -n "Waiting for node(s)...."
    # Poll every 30 seconds, four times (two minutes in total).
    for (( i=1; i<5; ++i )); do
        sleep 30
        TOTAL=$($CASSANDRA_BIN/nodetool status | grep -c rack1)
        UP=$($CASSANDRA_BIN/nodetool status | grep -c '^UN ')
        # Ready when every listed node is in the UN state. TOTAL must be
        # non-zero so a failed nodetool call (0 = 0) does not count as ready.
        if [ "$TOTAL" -gt 0 ] && [ "$TOTAL" -eq "$UP" ]; then
            echo "done"

            echo -n "Repairing Cassandra...."
            OUTPUT=$($CASSANDRA_BIN/nodetool flush)
            OUTPUT+=$($CASSANDRA_BIN/nodetool repair)
            echo "done"

            echo -n "Stopping Cassandra....."
            OUTPUT+=$($CASSANDRA_BIN/nodetool stopdaemon)
            if [ "$?" -ne 0 ]; then
                echo "FAILED"
                echo
                echo "$OUTPUT"
                exit 1
            else
                echo "done"
            fi

            echo -n "Starting Log Insight..."
            OUTPUT=$(service loginsight start)
            if [ "$?" -ne 0 ]; then
                echo "FAILED"
                echo
                echo "$OUTPUT"
                exit 1
            else
                echo "done"
            fi
            exit 0
        fi
    done

    # The nodes did not all come up in time: report and shut Cassandra down.
    echo "FAILED"
    echo
    echo "ERROR: Cluster is not fully up after two minutes"
    $CASSANDRA_BIN/nodetool status | grep rack1
    echo
    echo -n "Shutting down Cassandra..."
    OUTPUT=$($CASSANDRA_BIN/nodetool stopdaemon)
    if [ "$?" -ne 0 ]; then
        echo "FAILED"
        echo
        echo "$OUTPUT"
        echo
        echo "WARNING: Do not start Log Insight until Cassandra has been stopped!"
        echo "In worst case, restart the virtual appliance."
        exit 1
    else
        echo "done"
    fi
    echo
    echo "Ensure all other nodes in the cluster are online. You can run this script again to attempt a repair"
    exit 1
}

# Dispatch on the first argument. Both functions exit on their own, so the
# usage message is only reached for a missing or unknown flag.
case "$1" in
    --status) status 2>/dev/null ;;
    --repair) repair "$2" 2>/dev/null ;;
esac
echo "USAGE: $0 [--status|--repair]"
exit 1
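
The "Waiting for node(s)" step in the script decides that the cluster is ready by comparing the number of node lines in the nodetool status output (matched on rack1) with the number of lines that start with UN. If you want to experiment with that check in isolation, here is a minimal sketch that reads the status text from stdin. The function name all_nodes_up is hypothetical, matching on rack1 assumes the nodes report that rack name as they do in the script, and the sketch also treats empty output (for example, when nodetool cannot reach Cassandra) as not ready:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Log Insight or Cassandra.
# Reads `nodetool status` output on stdin and succeeds only when at least
# one node line is present and every node line starts with "UN ".
all_nodes_up() {
    local status total up
    status=$(cat)
    # Node lines are identified by the rack name, as in the script above.
    total=$(printf '%s\n' "$status" | grep -c 'rack1')
    # Lines beginning with "UN " are nodes in the UN state.
    up=$(printf '%s\n' "$status" | grep -c '^UN ')
    [ "$total" -gt 0 ] && [ "$total" -eq "$up" ]
}

# Example usage (run on the appliance, with CASSANDRA_BIN set as in the script):
#   $CASSANDRA_BIN/nodetool status | all_nodes_up && echo "All nodes are up"
```

Keeping the check in a function that reads stdin makes it easy to try against saved nodetool output without touching a running cluster.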

© 2015, Steve Flanders. All rights reserved.

2 comments on “Heads Up! Log Insight Fails to Start with: Cannot Connect”

Know what? This just fixed my problem with vRLI 8.4. Crazy! Thank you!

It has been a few years, but the technology remains similar — I am glad this post helped!
