Note: See also Article 008.
Introduction
On a modern operating system, it is likely that some form of threat management system is installed. While in the past only antivirus and port-/application-based firewall solutions were common on hosts, intrusion prevention systems (IPS) are increasingly becoming the norm. These can perform traffic and behaviour analysis, and block suspicious activity.
To a threat management system, Redstor Pro’s behaviour can appear suspicious. Threat management systems can prevent Redstor Pro from backing up successfully.
It is important to understand how Redstor Pro can be disrupted by threat management systems, and what to look for when troubleshooting. This guide demonstrates a number of scenarios in general terms.
Scenarios
- Antivirus interferes with Agent installation
- Firewall/IPS prevents transmission to Storage Platform
- Firewall/IPS prevents communication between Redstor Pro processes
- General troubleshooting strategy for antivirus/IPS problems
A. Antivirus interferes with Agent installation
1. Installation interference
Symptom: Backups will not run and sometimes a connection to the Storage Platform cannot be established.
Example error message: The Agent may halt indefinitely at a stage such as “Building Selection List”.
Cause: This scenario typically occurs when antivirus exclusions have not been applied to exclude the Redstor Pro installation folder (default location: C:\Program Files\Attix5 Pro\Backup Client or Backup Client SE). The antivirus software scans the installation folder and incorrectly identifies the working files and folders, including those of the Java installation, as potential virus threats. It often deletes these or blocks access to them.
Solution: Exclude the installation folder from antivirus scans. Reinstalling the Agent and reconnecting to the backup account may be required to get the Agent back into a known good state.
2. Agent database lock and corruption
Symptom: The Agent reports a database error when trying to run a backup, or reports that the database is locked.
Example error messages:
- In the Redstor Pro GUI or backup log file: "file is encrypted or is not a database" or "Could not calculate differences between new and previous backup sets:database is full".
- In the Redstor Pro GUI or backup log file: "Unable to recover from exception in backup process".
- In the Redstor Pro GUI or backupservice.log file: “com.attix5.sqlite.SQLiteException: database is locked”.
Cause: The antivirus software has either scanned and corrupted the Agent database (backuplist.db), or has a file handle on the database, preventing the Agent from getting an exclusive lock.
Solution: Exclude the database folder from antivirus scans. It may be necessary to delete the backuplist.db file and reconnect to the backup account in order to get the last known good version of the Agent database.
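Before deleting backuplist.db, it can help to confirm whether the file is actually damaged. The following is a minimal sketch, not a Redstor tool. It assumes that backuplist.db is a standard, unencrypted SQLite file (the error messages above suggest SQLite, but this article does not confirm the format), and that the check is run on a copy of the file while the Agent service is stopped. The function name check_sqlite_copy is hypothetical.

```python
import sqlite3
from pathlib import Path


def check_sqlite_copy(db_copy: str) -> str:
    """Run SQLite's built-in integrity check on a COPY of a database file.

    Assumption: the file is a plain SQLite database. If the Agent stores it
    in an encrypted or custom format, a healthy file is also reported as
    unreadable, so treat the result as a hint only.
    """
    # Open read-only so the check itself cannot modify the file.
    uri = Path(db_copy).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as exc:
        return f"cannot open: {exc}"
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Raised, for example, with "file is not a database".
        return f"unreadable: {exc}"
    finally:
        conn.close()
    if rows == [("ok",)]:
        return "ok"
    return "corrupt: " + "; ".join(str(row[0]) for row in rows)
```

A result other than "ok" on a copy taken after the antivirus exclusions were applied suggests that the file was already damaged and needs to be replaced as described in the solution above.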
3. Cache corruption
Symptom: Files are flagged as “trouble” by either the Agent or the Storage Platform, and larger-than-expected data transfers are occurring. Errors can also be seen during the adding/patching phase of a staged backup.
Example error messages:
- In the Agent log when patching: “Unable to patch file” and “Trouble file encountered and added as full file”.
- In the Agent log when updating the cache: “Patching cache file failed”, “Could not apply patch to file…”, “Could not move backup file to cache… reason: Trying to move a file to the cache that does not exist” and “Flagged file for full backup”.
- During a staged backup, in the adding/patching file stage, in the backupservice.log file: “WARN com.attix5.service.spcomms.StoragePlatformConnection – File not found in toBackup”.
Cause: Files in the cache or toBackup folders have been altered or removed by the antivirus application, which incorrectly handled them as a threat. The affected cache files either no longer exist or no longer match the checksums held in the Agent database. Such files cannot be patched and so must be resent in full, which can result in larger-than-expected data transfers. Deletion of files in the toBackup folder can cause the backup to fail.
Solution: Exclude the cache and toBackup folders from antivirus scans. This will prevent the files from being altered and needing to be resent in full.
4. Antivirus interferes with file read and write operations
Symptom: File reads fail or are unusually slow. Some files within the VHDTemp and WindowsImageBackup folders are shown in the backupservice.log file as failing to rename, which causes the system state backup to fail.
Example error message: The Agent log may show “Unable to read file”. Slow backups will not show this error, but the time taken to process each file will be noticeably slower than normal.
The following may be seen in the backupservice.log when system state rename failures are occurring:
- “Could not rename RenamedFolder”
- “WBAdminPlugin - Could not delete E:\WindowsImageBackup\SERVERNAME\SystemStateBackup\RenamedFolder\VHD-Hex-ID” (e.g. 5106e16c-60d3-11de-ae0d-806e6f6e6963)
- “WBAdminPlugin - Could not rename E:\WindowsImageBackup\SERVERNAME\SystemStateBackup\RenamedFolder to E:\WindowsImageBackup\SERVERNAME\SystemStateBackup\Backup YYYY-MM-DD HHMMSS” (e.g. 2013-02-28 093759)
Cause: File reads and writes are being monitored by antivirus on-access scanning, which can cause read failure or slow backup speeds. The failure to rename is caused by the antivirus software holding the file open during this process.
Solution: Exclude the installation folder and Redstor Pro Agent service (a5backup.exe or a5backup64.exe on 64-bit machines) from antivirus scanning. Also ensure WindowsImageBackup and VHDTemp are excluded from scanning.
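Because slow reads do not produce an error, a before-and-after measurement can help show whether an exclusion made a difference. The sketch below is an illustration only: it times a sequential read of one file and reports the throughput. The function name read_throughput_mib_s and the 1 MiB chunk size are assumptions, not part of Redstor Pro.

```python
import time


def read_throughput_mib_s(path: str, chunk_size: int = 1024 * 1024) -> float:
    """Read a file sequentially and return the observed throughput in MiB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        # Timer resolution too coarse for a very small file.
        return float("inf")
    return total_bytes / (1024 * 1024) / elapsed
```

Run it against the same large file before and after applying the exclusions. The operating system caches recently read files, so a second read of the same file is often faster regardless of antivirus settings; use a file that has not been read recently, or compare several different files of similar size.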
5. Communication error during restore
Symptom: During the Receiving files process of the restore, the Agent receives a communications error.
Example error message: "Error: 14:17:11 Communications error: bad record MAC."
Cause: Certain antivirus/security applications (e.g. Webroot SecureAnywhere, which is supplied by some online banking sites) intercept and sometimes modify SSL packets, leading to the "bad record MAC" error.
Solution: Uninstall/disable the antivirus application. Alternatively, resuming the restore works in most cases.
B. Firewall/IPS prevents transmission to Storage Platform
1. Agent cannot contact AccountServer
Symptom: The Agent will not authenticate against the AccountServer. The AccountServer can be contacted by the Storage Platform Communication Test tool or telnet. The Agent reports a read I/O failure, and does not proceed to send data to the StorageServer. Multiple retries may be seen.
Example error message: In the Agent log, when initiating a connection: “Cannot connect to Storage Platform: Connection refused: connect” or “IOException connecting to Storage Platform: Connection timed out: connect”.
Cause: Traffic from the Agent to the AccountServer is intercepted and blocked or dropped by a firewall or IPS that treats it as suspicious.
Solution: Ensure that traffic from the Redstor Pro Agent service to the AccountServer is permitted in the firewall or IPS rules.
2. Agent cannot contact StorageServer
Symptom: The Agent does not send data to the StorageServer, or stops sending after a period of time. The Agent reports a read I/O failure. Multiple retries may be seen. Backups do not complete. In some circumstances, only small file selections will back up successfully.
Example error message: In the Agent log when initiating a connection: “Could not create backup: Failed to initiate streaming backup” or “IOException connecting to Storage Platform: Connection timed out: connect”.
If a connection drops unexpectedly: “Backup transfer failed”, “sendBackup could not send backup file: Exception in writer thread: Connection reset by peer: socket write error” or “sendBackup could not send backup index file: Connection reset by peer: socket write error”.
Cause: Traffic from the Agent to the StorageServer is intercepted and blocked or dropped by a firewall or IPS that treats it as suspicious. The drop may not happen immediately, but only after a period of time or a certain volume of data.
Solution: Ensure that traffic from the Redstor Pro Agent service to the StorageServer is permitted in the firewall or IPS rules.
Note: There are other issues that produce the same or very similar symptoms. Speed and duplex settings mismatches or errors can cause similar network traffic drops. Also, ISP-based traffic shaping or throttling can prevent effective communication. These causes should not be ruled out when investigating AccountServer and StorageServer communication issues.
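The two connection errors quoted in this section point to different behaviour: "Connection refused" means the attempt was actively rejected, while "Connection timed out" usually means packets were silently dropped. The following sketch reproduces that distinction with a plain TCP connection attempt, in the same spirit as the telnet test mentioned above. The function name probe_tcp is hypothetical, and the host and port must be taken from your own Storage Platform details; this article does not specify them.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt a TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <detail>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Actively rejected: a closed port or a firewall sending a reset.
        return "refused"
    except socket.timeout:
        # No reply in time: typical of a firewall silently dropping packets.
        return "timeout"
    except OSError as exc:
        # DNS failure, unreachable network, and similar problems.
        return f"error: {exc}"
```

A result of "open" from this script does not prove that the Agent is permitted. Host firewalls and IPS products often apply rules per application, so the Redstor Pro Agent service can still be blocked when another program on the same machine connects successfully.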
C. Firewall/IPS prevents communication between Redstor Pro processes
1. System tray cannot open Agent GUI
Symptom: When right-clicking the system tray icon and clicking Open, it is reported that “The Backup Service is not running.” However, on inspecting the Windows Services menu, the backup service is running. Restarting the system tray application does not resolve the problem.
Example error message: “The Backup Service is not running.”
Cause: A host firewall/IPS is blocking communication between the system tray application and the backup service. Believing that the service has stopped, the system tray does not start the Agent GUI.
Solution: Ensure that the system tray application (A5Loader.exe on the Desktop and Laptop Edition, SERunner.exe on the Server Edition) and backup service (a5backup.exe, or a5backup64.exe on 64-bit machines) are permitted in the firewall or IPS rules.
2. Agent GUI and service cannot communicate
Symptom: The Agent GUI does not load the backup selection, does not save settings when trying to close, or is unresponsive. The backup service is running.
Note: This behaviour is similar to the backup service having stopped.
Example error message: An error message might not be displayed. The GUI may fail to load the backup selection tree from the service.
Cause: A host firewall/IPS is blocking communication between the Agent GUI and the Agent service.
Solution: Ensure that the Agent GUI (javaw.exe) and Agent service are permitted in the firewall or IPS rules.
3. Backup service and Exchange agent service cannot communicate
Symptom: When running an Exchange SIR Plus backup or during configuration, a communication error is shown.
Example error message: In the backup log file: “Could not communicate with the Exchange Agent: Error in call with Exchange Agent, Connection refused: connect.”
Cause: A host firewall/IPS is blocking communication between the Agent service and the Exchange Agent service (A5EA.exe) used by the SIR Plus plugin.
Solution: Ensure that the Exchange Agent service and Agent service are permitted in the firewall or IPS rules.
4. SP Console and Agent remote management cannot communicate
Symptom: The SP Console cannot connect to the remote management service of an Agent.
Example error message: In the Console: “The operation has timed out.”
Cause: A host firewall/IPS is blocking communication between the Console and the Agent service.
Solution: Ensure that access to the remote management port (default 9091) of the Agent service is permitted in the firewall or IPS rules.
Note: When connecting over a network, bear in mind that network-based firewalls and IPS systems can also prevent remote management connections from being established. Additionally, any Agents sitting behind network address translation will require port forwarding to be configured.
D. General troubleshooting strategy for antivirus/IPS problems
If the Redstor Pro software is not behaving as expected, antivirus or intrusion prevention software (IPS) is usually the cause. Possible interactions also exist with third-party software such as disk optimisation and monitoring software.
We recommend the following steps to troubleshoot the issue:
1. Apply antivirus exclusions
- Apply the exclusions for Redstor Pro folders and executables (as per Article 008) to the antivirus software.
- Test the results after each exclusion and repeat the process until all reasonable configurations have been tried.
Note:
- Often on-access and scheduled exclusions are separated, and may need to be configured separately. Similarly, process and file access may require separate configuration.
- Due to the large variety of antivirus/IPS software, exclusion requirements may vary greatly.
2. Eliminate known third-party software
- Possible interactions exist with third-party software such as disk optimisation and monitoring software. If exclusions can be applied, exclude Redstor folders and processes (as per Article 008) or disable the software.
Note: Other third-party backup applications running concurrently with Redstor Pro can also cause problems and could be mistaken for antivirus behaviour.
- Test the result after each action.
3. Disable the antivirus
- If Steps 1 and 2 fail, identify and disable any secondary antivirus software that is active.
- Then disable the main antivirus altogether, testing the outcome after each action.
4. Uninstall all antivirus software
- Temporarily remove the antivirus software to rule out other antivirus processes running in the background that are not subject to configurable exclusions.
Note: Redstor has found that some antivirus software disregards exclusions and continues to run anyway.
5. Eliminate unknown third-party software interactions
If the problem persists after the steps above, an unknown piece of third-party software may be interacting with Redstor Pro. We recommend a process of elimination in a lab environment where the problem can be recreated and software uninstalled.
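When starting an investigation, the example error messages quoted in scenarios A to C can be used to triage a log file. The sketch below is an illustration under stated assumptions, not a Redstor utility: it searches log lines for a selection of those messages and counts how many point to each scenario. The scenario labels (such as "A.2") are shorthand for the headings in this article, and the names SIGNATURES and triage are hypothetical.

```python
from collections import Counter
from typing import Iterable

# (text to look for, scenario in this article). All message fragments are
# copied from the example error messages quoted above.
SIGNATURES = [
    ("database is locked", "A.2 database lock/corruption"),
    ("file is encrypted or is not a database", "A.2 database lock/corruption"),
    ("Unable to recover from exception in backup process", "A.2 database lock/corruption"),
    ("Unable to patch file", "A.3 cache corruption"),
    ("Patching cache file failed", "A.3 cache corruption"),
    ("File not found in toBackup", "A.3 cache corruption"),
    ("Unable to read file", "A.4 file read/write interference"),
    ("Could not rename", "A.4 file read/write interference"),
    ("bad record MAC", "A.5 restore communication error"),
    ("Cannot connect to Storage Platform", "B.1 AccountServer unreachable"),
    ("Failed to initiate streaming backup", "B.2 StorageServer unreachable"),
    ("Connection reset by peer", "B.2 StorageServer unreachable"),
    ("Connection timed out", "B.1 or B.2 connection timeout"),
    ("Could not communicate with the Exchange Agent", "C.3 Exchange Agent blocked"),
]


def triage(lines: Iterable[str]) -> Counter:
    """Count log lines per scenario, using case-insensitive substring matches."""
    hits: Counter = Counter()
    for line in lines:
        lowered = line.lower()
        for needle, scenario in SIGNATURES:
            if needle.lower() in lowered:
                hits[scenario] += 1
                break  # count each line once, under its first matching signature
    return hits
```

To use it, open the log file in text mode and pass the file object to triage, then review the scenarios with the highest counts first. As the notes in this article point out, other causes can produce the same messages, so a match is a starting point for investigation rather than a diagnosis.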