Backup & Disaster Recovery How-tos & Troubleshooting

Is data on the Backup & Disaster Recovery appliance encrypted?

The hard disks where the Backup & Disaster Recovery appliance stores backups are encrypted to protect data from being accessed offline.

See details.

The system does not support transferring files over the VNC connection to or from a virtual machine booted on the Backup & Disaster Recovery appliance.

The highest MTU value you can set for jumbo frames is 8900.

Note that this works on NIC1 but not on NIC0: the two interfaces have different e1000 chipsets and different feature sets. No error is shown if the new MTU size does not take effect. The MTU value simply reverts to the default when you reload the page.

If you are currently using NIC0:

  1. Transfer settings from NIC0 to NIC1.
  2. Set your MTU on NIC1.
  3. Enable NIC1.
  4. Disable NIC0.
  5. Submit and activate configuration changes.
  6. Restart the appliance to make sure the changes are applied.
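Because no error is shown when the MTU change does not take effect, it can help to test the path with a ping that is not allowed to fragment. This check is not part of the steps above; the sketch below only does the arithmetic for the payload size, assuming IPv4 without IP options (a 20-byte IP header plus an 8-byte ICMP header). The function name is illustrative.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Return the largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} leaves no room for a payload")
    return mtu - overhead


if __name__ == "__main__":
    for mtu in (1500, 8900):
        print(f"MTU {mtu}: largest unfragmented ping payload is {max_ping_payload(mtu)} bytes")
```

For an MTU of 8900 the payload is 8872 bytes, which you could pass to `ping -f -l 8872 <appliance address>` on Windows or `ping -M do -s 8872 <appliance address>` on Linux.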

By default, each Backup & Disaster Recovery appliance is provisioned with a predefined RAID level.

See details.

The disaster recovery backup agent does not have to run as a system or domain administrator on your computer. However, the agent must have sufficient access to the files it needs to back up. Otherwise, you will get Access Denied errors or warnings in the message logs of the backup.

There are two options for successful backups:

  • Run the agent as an admin (easiest)
  • Create a specific user that will give the Backup & Disaster Recovery appliance needed access for the backups.

The second option may take some trial and error: review the message logs to make sure that no files are denied access. It is still a workable approach if you do not want the agent to run as an admin.
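One way to shorten the trial and error is to check, before the first backup, which files a given account cannot read. The sketch below is an illustration only and not part of the backup agent: it assumes you run it as the same user the agent will run as, and `find_unreadable_files` is a hypothetical helper name.

```python
import os
import sys


def find_unreadable_files(root: str) -> list:
    """Walk `root` and return the paths the current user cannot open for reading."""
    unreadable = []

    def on_walk_error(error: OSError) -> None:
        # os.walk calls this for directories it cannot list.
        unreadable.append(error.filename or "<unknown>")

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    handle.read(1)  # Reading one byte is enough to prove access.
            except OSError:  # Includes PermissionError (access denied).
                unreadable.append(path)
    return sorted(unreadable)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python check_access.py <folder in the backup file set>")
    else:
        for path in find_unreadable_files(sys.argv[1]):
            print("cannot read:", path)
```

An empty result does not guarantee a clean backup, because files locked by other programs can behave differently during the job, so the message logs remain the final check.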

  1. In the Management Console of the Backup & Disaster Recovery appliance, go to Clients › Summary.

  2. Right-click the client you want to delete, and then click Edit.

  3. Click Actions, and then click Edit.

    This will take you to Clients › Edit.

  4. Click Delete, and confirm the deletion.

    Wait a few minutes to allow the appliance to complete the process.

  5. Click Activate configuration to apply changes.

To restrict access from the Infrascale Dashboard to Backup & Disaster Recovery appliance:

  1. Log in to the appliance Management Console, and go to the Settings tab.

  2. In the Security group, click Remote Access.

  3. Configure the remote access settings as desired.

A DR image backup is a type of backup designed for quick recovery of your critical systems.

DR image backups allow administrators to boot full systems on the appliance or in the cloud without the need for a boot disk.

Things to know

  • VMware environments do not require installing the DR imaging software when configured from the host level.

  • Hyper-V environments require installing the DR imaging software on the Hyper-V host.

  • DR image backups can be run on live systems.

  • Incremental backups are converted into synthetic full backups, which makes a full restore independent of the rest of the incremental backup chain.

  • If you have other backup jobs on this machine, you may want to configure a new client just for the DR image backups, as you may want different retention policies to be set.

What you need

  • Access to the machines for which you want to set up a DR image backup job.

  • A configured client for the machines for which you want to set up a DR image backup job.

Step 1. Install the DR imaging software

  1. In the appliance Management Console, go to Settings › Downloads.

  2. Download the DR Backup Agent.

  3. Launch the DR Backup Agent on the machine you want to protect.

    Installation is automatic and requires no additional steps.

Step 2. Configure a backup client

  1. In the appliance Management Console, go to Clients.

  2. Right-click the client with the installed DR backup agent, and then click Edit.

  3. Configure the backup schedules.

  4. Configure the retention policy.

  5. Enter an email address to receive notifications.

  6. Optionally, identify scripts to execute before or after the backups.

  7. Configure the networking options.

Introduction

We have put together the information below to help you understand how SQL Server can be scripted to create a backup or “flat” file.

The concept is that you create a SQL script to start the SQL backups, then run that script from the Run Before command option in the appliance Management Console › Clients › Edit. When the backup begins, we can then capture the SQL backup file created and saved to a certain location by your script file.

Create a script file

Using a simple text editor, create a file with a descriptive name, such as SqlDatabaseBackup.sql. For this example, we will save the file on drive C, so the full path is C:\SqlDatabaseBackup.sql.

The contents of this file are explained below.

Database full backup

Regardless of how many files your database uses, you can create a complete backup of your database with one simple command:

BACKUP DATABASE [dbname] to [backup_device] 

For example, BACKUP DATABASE pubs to disk = 'c:\mssql\backup\pubs.bak' creates a complete backup of the pubs database in the file c:\mssql\backup\pubs.bak.

The database can remain online and accessible to users while this backup is being made. To take a consistent snapshot of the database, a copy of the transaction log is also included with the database backup.

In the backup file, SQL Server stores the names and locations of the files actually used in the database. Upon restoring the database, SQL Server recreates all the necessary files, as many as there might be. A database thus restored is equivalent to the point in time that the backup finished. A complete database backup is very simple to execute and use in recovery. If your data does not change often, you might schedule a nightly full backup of your database. Even if you need more frequent backups, regularly scheduling a full database backup each hour might be sufficient (if your database is small enough).

So your SqlDatabaseBackup.sql file simply has one BACKUP DATABASE command for each database you wish to back up. Enter each command on a new line. Be sure to save each database BAK file in a directory included (either inherently or specifically) in your backup file set.

Finally, edit your SQL client on the appliance to include a Run Before Backup command. The command to run is: sqlcmd -i C:\SqlDatabaseBackup.sql.

When you back up your SQL Server as a “flat” file, make sure that you back it up to the same file name each time. Using unique file names can cause problems for customers who use replication.
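If you have many databases, the script file can be generated instead of typed by hand. The following is a minimal sketch, not the vendor's tooling: `build_backup_script` is a hypothetical helper, `northwind` and the folder are placeholder names, and the optional `WITH INIT` clause (which overwrites the existing backup sets in the file instead of appending to them) is an assumption that the original instructions do not mention.

```python
from pathlib import PureWindowsPath


def build_backup_script(databases, backup_dir, overwrite=False):
    """Return T-SQL text with one BACKUP DATABASE command per database.

    Every database always maps to the same <name>.bak file, so the file name
    does not change between runs.
    """
    lines = []
    for name in databases:
        target = PureWindowsPath(backup_dir) / f"{name}.bak"
        # Double any single quote so the path stays a valid T-SQL string literal.
        literal = str(target).replace("'", "''")
        # Double any closing bracket so the name stays a valid delimited identifier.
        identifier = name.replace("]", "]]")
        command = f"BACKUP DATABASE [{identifier}] TO DISK = '{literal}'"
        if overwrite:
            command += " WITH INIT"  # Assumption: overwrite instead of append.
        lines.append(command + ";")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder database names and folder; replace with your own.
    print(build_backup_script(["pubs", "northwind"], r"C:\mssql\backup"), end="")
```

Save the printed text as the script file that the Run Before Backup command passes to sqlcmd, and make sure the target folder is part of the backup file set.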

Prerequisites

There are two ways to back up an Oracle database: RMAN and VSS. With either method, to back up the database without stopping access to it, you need to put the database into ARCHIVELOG mode.

If the database is not already in ARCHIVELOG mode, setting that mode requires restarting the database.

Back up using RMAN

Create an RMAN script file to back up your database to a location on the server’s local drive:

run {
allocate channel oem_backup_disk1 type disk format 'c:\temp\db_backup\%U';
backup as COPY tag '%TAG' database;
backup as COPY archivelog all not backed up delete all input;
release channel oem_backup_disk1;
}
run {
allocate channel oem_backup_disk1 type disk format 'c:\temp\db_backup\%U' maxpiecesize 1000 G;
backup as COPY tag '%TAG' current controlfile;
release channel oem_backup_disk1;
}

Then, in the client setup on the appliance, add a Run Before Backup command that runs the script:

rman target "sys/a@orcl" cmdfile "c:\rman_backup.txt"

In this example, sys is the database user name, a is the password, and orcl is the name of the database being backed up.

Make sure that the file set for the client includes the directory where RMAN puts the database backup.

In the example above, it is c:\temp\db_backup.

After the backup job has backed up the copy of the database, the copy can be deleted from the server. To do this, set a Run After Backup command. For the example above, it is del /Q c:\temp\db_backup\*.

Back up using VSS

First, install the Oracle VSS Writer service for the database. This is done with the oraVSSw command:

oraVSSw SID /I 

where SID is the database SID.

Be careful to get the SID right: there is no warning if it is wrong. The service will be registered but will not do anything.

Once the Oracle VSS Writer service has been registered, you should be able to see it in the Services snap-in (services.msc).

The service is listed with the SID appended to its name.

By default, this service is configured to run under the Local System account, but this account does not normally have adequate access to the database. Stop the service, right-click it, click Properties, go to the Log On tab, and set the service to run as a user who is a member of the ORA_DBA group. Then start the service again. Run the vssadmin list writers command to check that the Oracle VSS Writer for your database is shown. If it is not, the user that the service runs as might not have enough access to the database.
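The check of the writer list can also be scripted. The sketch below parses text in the layout that vssadmin list writers prints (a "Writer name:" line followed by indented detail lines) and reports the state of every writer whose name contains a given fragment. The sample text is abbreviated and illustrative, and the display name of the Oracle writer in it is an assumption, so match on a fragment rather than on the exact name.

```python
import re

WRITER_RE = re.compile(r"^Writer name: '(?P<name>[^']+)'")
STATE_RE = re.compile(r"^\s*State: \[\d+\] (?P<state>.+?)\s*$")

# Abbreviated, illustrative sample; real output has more detail lines per writer.
SAMPLE = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Oracle VSS Writer - ORCL'
   State: [1] Stable
   Last error: No error
"""


def parse_writers(output: str) -> dict:
    """Map each writer name to its state text (None if no State line follows it)."""
    writers = {}
    current = None
    for line in output.splitlines():
        name_match = WRITER_RE.match(line)
        if name_match:
            current = name_match.group("name")
            writers[current] = None
            continue
        state_match = STATE_RE.match(line)
        if state_match and current is not None:
            writers[current] = state_match.group("state")
    return writers


def find_writers(output: str, fragment: str) -> dict:
    """Return only the writers whose name contains `fragment` (case-insensitive)."""
    wanted = fragment.lower()
    return {name: state for name, state in parse_writers(output).items() if wanted in name.lower()}


if __name__ == "__main__":
    print(find_writers(SAMPLE, "oracle") or "No Oracle writer listed")
```

An empty result corresponds to the case described above, where the writer is not shown and the service account may lack access to the database.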

Set the client to include all non-excluded writers, and run the backup. This should back up the database. To restore the database, first stop it, then browse and restore the job you want, select Oracle VSS Writer, and click Restore. Afterwards, recover or start the database again.

These instructions use Shrew Soft VPN Client (Standard edition) for Windows as an example. If you use another VPN client, or another operating system, the routine may differ somewhat.

  1. In the Infrascale Dashboard, go to Disaster Recovery › Hybrid Cloud Status.

  2. On the DR Runs tab, find and click a DR run you need to connect to, then click View Details in the left pane that opens.

    The DR run must be booted, that is, it must have the RUNNING status (see the Status column).

  3. On the right side of the DR run details page, note the Public IP and Pre-shared Key for VPN values. You will use them later to establish the VPN connection.

  4. Download and install the VPN client.

  5. Open the VPN client (VPN Access Manager), and click Add (or Edit › Add).

    The VPN Site Configuration dialog box opens.

  6. On the General tab, in the Host Name or IP Address box, enter the public IP address you have noted in the DR run details page.

  7. On the Authentication tab:

    1. In the Authentication Method drop-down list, select Mutual PSK.

    2. On the Credentials subtab, in the Pre Shared Key box, enter the pre-shared key for VPN you have noted on the DR run details page.

  8. On the Phase 1 tab, in the Exchange Type drop-down list, select main.

  9. Click Save to apply changes.

  10. Select the newly created configuration, and then click Connect.

    The VPN Connect dialog box opens.

  11. On the Connect tab, click Connect to establish the VPN connection.

    The VPN client will show tunnel opened after it establishes the connection successfully.

  1. Log in to the Management Console of the Backup & Disaster Recovery appliance.

  2. Go to Jobs › Settings.

  3. In the Job Schedules drop-down list box, select Disabled.

  4. Click Apply, and then click Activate Configuration on the upper right.

This will turn off all scheduled jobs until you select Enabled again.

You can still start jobs manually for individual clients on the Clients › Summary tab.

Restoring a computer's system state restores the computer's registry, COM+ class database, boot and system files, and so on from a system state backup. A system state backup is performed when you select the VSS and System State options.

For an Active Directory server, the system state also includes the Active Directory domain controllers. You can also restore the domain controllers separately from the rest of the system state writers.

When you select a backup job and select Browse and Restore, the file browser for the backup job will appear.

When you restore the system state to a computer, it is important that you restore the entire folder of system state writers and not individual writers. Also, you must restore the system state to the specific computer that it was backed up from (or, at least, to a computer with the exact same configuration as the backed up computer).

In the Navigation Screen pane, expand the VSS folder and select the System State folder. This will restore all of the folders under the System State folder, as required.

If you attempt to select or clear individual folders on the Entry List pane (on the right side), you will see a warning about unexpected results from a partial system state restore. Make sure that you check the System State folder and that you leave the other folder selections unchanged (all subfolders checked).

Then click Restore on the file browser screen. The system state will be restored to the client computer.

Once you have started a system state restore, you must not interrupt it in any way. This includes canceling the restore job. Any interruption can leave the system in a non-bootable or unusable state.

The following information applies both to the primary and to the secondary appliances.

By default, a backup job can be deleted automatically:

  • under the retention settings, or

  • if the respective client is deleted or unregistered, and the Backup & Disaster Recovery appliance was configured to delete the orphaned jobs.

To prevent automatic deletion of a backup job in those cases, you can pin the job as follows:

  1. In the appliance Management Console, go to Jobs › History.

  2. Right-click a successfully completed job, and then click Pin Job.

    Also, you can select a job, and then click Pin Job on the toolbar.

  3. In the Pin Job dialog:

    1. In the Date Until Pinned combo box, set the date until which the job will stay pinned.

    2. In the Comments box, enter a comment, and then click OK.

Pinned jobs are indicated by a checkmark in the Pinned column. You can sort and filter the jobs by this status.

To unpin a job, right-click the pinned job, and then click Unpin Job.

To view or edit the details of a pinned job, right-click the pinned job, and then click Pin Info.

To reset Changed Block Tracking (CBT) for a VMware virtual machine (VM):

  1. In the Management Console of the Backup & Disaster Recovery appliance, go to Clients › VMware.

  2. Select a VM you want to reset CBT for.

    CBT can be reset only for VMs that are registered, powered on, do not have snapshots, and have CBT allowed.

  3. On the toolbar, click Advanced, and then click Reset CBT.

After the system finishes resetting CBT, it will show a dialog box with confirmation.

If you lost or forgot the password to log in to the appliance Management Console, you can reset it. For this, you must have access to the appliance terminal.

8.7.0 and later

To change or reset the password on the appliance with firmware version 8.7.0 and later:

  1. Restart the appliance.

  2. Wait for the appliance to show the boot loader, and then press any key (except Enter and Return) to interrupt booting.

    You have 5 seconds to interrupt before the appliance continues booting. Otherwise, you will have to restart the appliance again.

  3. Select the option that starts with OS [version], and then press e to edit the commands executed before booting.

  4. Go to the end of the line that starts with linux, enter rd.break, and then press Ctrl+X to start booting.

    The appliance boots into the shell.

  5. Check the access mode on the /sysroot mount point.

    For this, run the following command:

    mount | grep sysroot

    • If /sysroot is mounted in the read-only mode (see ro in the command output), remount it in the read-write mode.

      For this, run the following command:

      mount -o remount,rw /sysroot/

    • If /sysroot is mounted in the read-write mode (see rw in the command output), continue with the next step.

  6. Change the root directory.

    For this, run the following command:

    chroot /sysroot 

  7. Set the new password.

    For this, run the following command, and follow the on-screen instructions to set the password:

    passwd 

  8. Relabel the file system.

    For this, run the following command. It creates the /.autorelabel file, which makes SELinux relabel the file system on the next boot:

    touch /.autorelabel

  9. Exit the changed root environment, and then log out to continue with the normal boot.

    For this, run the following commands:

    exit
    logout

    Wait until the file system relabeling completes. This might take some time depending on the system performance and the number of files.

The appliance boots normally, and you can log in to the Management Console with the newly set password.

8.6.x and earlier

To change or reset the password on the appliance with firmware version 8.6.1 and earlier:

  1. Restart the appliance.

  2. Wait for the appliance to show the boot loader, and then press any key (except Enter and Return) to interrupt booting.

    You have 3 seconds to interrupt before the appliance continues booting. Otherwise, you will have to restart the appliance again.

  3. Select the option that starts with OS [version] (VGA), and then press e to edit the commands executed before booting.

  4. Select the line that starts with kernel, and then press e.

  5. Press Space, enter 1, press Enter, and then press b to start the shell.

  6. Set the new password.

    For this, run the following command, and follow the on-screen instructions to set the password:

    passwd 

  7. Restart the appliance.

    For this, run the following command:

    reboot 

The appliance restarts, and you can log in to the Management Console with the newly set password.

  1. Restart Backup & Disaster Recovery appliance.

    Once the appliance boots, it will show the login screen.

  2. Log in, and then select Configure Networking Option.

  3. Select Reset to Default.

    This resets the networking configuration.

  4. Change all values as necessary, that is, the short name, domain name, and default gateway.

  5. Select Restart Networking.

  6. Continue with the configuration to set the IP address (static or via DHCP).

  7. Save changes, and then restart the appliance.

If you assigned a static address to the appliance, you need to reboot it to apply the correct address. However, during the initial setup, the system takes a few minutes to set up the directory and storage structure, and you must wait for this to finish before you reboot. You can follow the status of this process in the system messages at the bottom of the screen. Do not restart the system until the UCAR setup is at 100% and the system shows The system is ready. After that, try to access the appliance Management Console through the web GUI.

If the backups seem locked or frozen (for example, not cleared when canceled), if the Backup & Disaster Recovery appliance behaves unexpectedly, or if you need to stop all backup jobs immediately, restart the backup service.

To restart the backup service on the appliance:

  1. Sign in to the appliance Management Console.

  2. Go to System › Director Status.

    The Director Status subtab shows the details about the RVX-Backup kernel version—that is, when the backup daemons started and how many jobs have run since then.

  3. Click Restart Backup Service.

    This will cause all running, importing, and deduping jobs to stop.

Sometimes, when a backup job was canceled or is hung, it still appears in Jobs › Active Jobs or Clients › Summary. If this is due to server-side issues, it is best to stop and restart the agent service on the server.

You can stop and restart the backup agent service using the built-in service manager on your server.

If you are using a Windows client, start the backup agent, and on the Service Control tab, click Stop, and then click Start.

Because editing the registry cannot be done without some risk to the system, it is recommended to create a system restore point before proceeding.

The registry is backed up using a VSS writer. VSS writers will only be backed up if the file set on the client has the following setting enabled: Use Service with All Non-Excluded Writers (recommended).

The VSS Registry Writer backs up the hive files, which backs up the registry as a single unit. A hive is a logical group of keys, subkeys, and values in the registry that has a set of supporting files containing backups of its data. Each time a new user logs on to a computer, a new hive is created for that user with a separate file for the user profile. This is called the user profile hive. A user’s hive contains specific registry information pertaining to the user application settings, desktop, environment, network connections, and printers. User profile hives are located under the HKEY_USERS key.

Traditionally, the writer would be restored in its entirety. If you wish to restore only select portions of the registry, you will need to pull the desired portions out of the registry hive files.

  1. Obtain the hive files from the VSS Registry Writer:

    1. From Jobs › History, right-click the desired job and select Browse and Restore.

    2. Browse to VSS:/System State/Registry Writer/Registry/C:/Windows/System32/config/.

    3. Select the names or links of the various files (these are called subkeys) located within the configuration folder one by one.

      This will prompt you to download the files to the location your web browser uses for downloaded files.

  2. Extract the desired portions of the registry:

    1. Open the hive files by temporarily loading them into the registry.

    2. Open Registry Editor, and select HKEY_LOCAL_MACHINE in the left panel.

      HKLM must be the active selection, or the option to Load Hive will not be available.

    3. Select File › Load Hive.

    4. Select one of the subkey files downloaded from the appliance, and click Open.

      Different parts of the registry are located in different areas/subkeys. Determine where to find the registry key you wish to restore.

      The Registry Editor will ask for a name for the key to load the hive into. You can name it any unique string. For example, backup.

    5. Browse into the key to find the portion of the registry you wish to restore.

      Due to the limitations of the registry editor, you will not be able to drag and drop the key to where you want it. You must first export the file, then import it again.

    6. Right-click the desired key and choose Export.

      Save the file in a location you will be able to remember and with a name that you will recognize.

      After the desired keys are exported, you need to close the hive file.

    7. Select the top key of the hive, click File › Unload Hive, and then click Yes to confirm.

    8. Right-click the REG file created by the export above, and choose Edit to open the file in Notepad.

    9. On the Edit menu, click Replace.

      Find the temporary location where the hive file was loaded (in this example, backup), and replace it with the location where the registry keys should be.

    10. Save and exit.

  3. Double-click the REG file you just edited and confirm all prompts to continue to import the file into the registry.
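The find-and-replace on the exported REG file can also be done with a short script that touches only the key header lines (the lines in square brackets), so a value that happens to contain the temporary name stays unchanged. This is a sketch under assumptions: `retarget_reg_text` is a hypothetical helper, the key names in the sample are made up, and the hive was loaded under HKEY_LOCAL_MACHINE\backup as in the example above.

```python
def retarget_reg_text(reg_text: str, temp_key: str, target_key: str) -> str:
    """Point exported key headers at `target_key` instead of `temp_key`."""
    prefix = "[" + temp_key
    result = []
    for line in reg_text.splitlines(keepends=True):
        # Registry key names are not case-sensitive, so compare in lower case.
        if line.lower().startswith(prefix.lower()):
            rest = line[len(prefix):]
            # Match whole key names only, not a longer name such as "backup2".
            if rest[:1] in ("\\", "]"):
                line = "[" + target_key + rest
        result.append(line)
    return "".join(result)


# Made-up sample in the layout of an exported REG file.
SAMPLE = "\r\n".join([
    "Windows Registry Editor Version 5.00",
    "",
    r"[HKEY_LOCAL_MACHINE\backup\ExampleVendor\ExampleApp]",
    r'"InstallDir"="C:\\backup\\app"',
    "",
])

if __name__ == "__main__":
    print(retarget_reg_text(SAMPLE, r"HKEY_LOCAL_MACHINE\backup", r"HKEY_LOCAL_MACHINE\SOFTWARE"))
```

Registry Editor saves REG files as UTF-16 text, so when working with a real file, open it with `encoding="utf-16"` and `newline=""` for both reading and writing, and review the result before importing it.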

This test will only identify network bottlenecks and will not indicate how fast a backup job will run.

Sometimes backups run slowly, and it is difficult to pin down the cause. This test helps to identify whether the problem is in the network and routing.

To start this test, you will need to first SSH in to the Backup & Disaster Recovery appliance as the root user.

Once at the command line, make the appliance the listening server by running the following command:

iperf -s 

Next, connect to the client that is currently showing signs of being slow.

For clients running Windows:

  1. Download iPerf, and move it to drive C.

  2. Open Command Prompt, and go to the root of drive C:

    cd /d C:\

  3. Enter the following command:

    iperf -c <ipaddress> -t 60

For clients running Linux, iPerf is usually available in any Linux distribution repository, and is often installed as part of the base installation.

To start the test, open a shell, and enter the following command:

iperf -c <ipaddress> -t 60

After about 60 seconds, when the test completes, the output looks similar to this:

C:\> iperf -c 198.51.100.34 -t 60
------------------------------
Client connecting to 198.51.100.34, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------
[132] local 198.51.100.79 port 56109 connected with 198.51.100.34 port 5001
[ID] Interval Transfer Bandwidth
[132] 0.0-60.3 sec 43.5 MBytes 6.05 Mbits/sec

This shows the average speed at which the client was able to send data to the appliance.

You can specify how long the test runs by changing the number after -t. The longer the test runs, the more representative the average speed is.

You can also flip the test to have the client be the server and the appliance be the client. This may show if there is a problem sending or receiving.
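To compare several runs, or the two directions of the flipped test, the summary line can be read by a script. The sketch below extracts the duration, the amount transferred, and the reported bandwidth from a summary line in the format shown above, and recomputes the bandwidth as a cross-check. It assumes the report is in MBytes and Mbits/sec as in the sample, with MBytes counted in units of 2**20 bytes and Mbits in units of 10**6 bits, which is consistent with the sample numbers. The function names are illustrative.

```python
import re

SUMMARY_RE = re.compile(
    r"(?P<start>\d+(?:\.\d+)?)\s*-\s*(?P<end>\d+(?:\.\d+)?)\s+sec\s+"
    r"(?P<transfer>\d+(?:\.\d+)?)\s+MBytes\s+"
    r"(?P<rate>\d+(?:\.\d+)?)\s+Mbits/sec"
)


def parse_summary(line: str):
    """Return (seconds, mbytes, mbits_per_sec) from a summary line, or None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    seconds = float(match["end"]) - float(match["start"])
    return seconds, float(match["transfer"]), float(match["rate"])


def mbits_per_sec(mbytes: float, seconds: float) -> float:
    """Convert MBytes (2**20 bytes each) over `seconds` into Mbits/sec (10**6 bits)."""
    return mbytes * 2**20 * 8 / seconds / 1e6


if __name__ == "__main__":
    seconds, mbytes, reported = parse_summary("[132] 0.0-60.3 sec 43.5 MBytes 6.05 Mbits/sec")
    print(f"reported {reported} Mbits/sec, recomputed {mbits_per_sec(mbytes, seconds):.2f} Mbits/sec")
```

Keep in mind that this number only describes the network path; as noted above, it does not indicate how fast a backup job will run.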

  1. In the Management Console of the Backup & Disaster Recovery appliance, go to the Settings tab.

  2. In the Basic group on the left, click Firmware Update.

    • If you have Automatically download updates to appliance selected, click Install to prepare for the update.

    • If you have Automatically download updates to appliance cleared, click Download to save the update file to the appliance.

      After the download is complete, click Install to prepare for the update.

  3. After the appliance is ready to install the update, click Reboot.

After you update the appliance firmware, we strongly recommend updating the backup agents on the clients associated with the appliance. See Install DR backup agent for details.

Multiple physical network interfaces on the same subnet may cause routing issues.

Configuring multiple network devices to be on the same subnet is not supported.

The proper way to set up the Backup & Disaster Recovery appliance when using both NICs is to configure one NIC with specific static routes, so that any traffic on those routes uses that NIC. Add these routes in Settings › Networking › Routes, and point each static route at the next-hop router on the network of that NIC. The other NIC uses the default gateway, which means that everything not specifically routed to the first NIC goes through it.
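The rule described above can be illustrated with a small model: a destination that matches a static route uses the NIC of that route (the most specific route wins), and everything else uses the NIC that holds the default gateway. The sketch uses only the Python standard library; the addresses come from documentation ranges, the function names are illustrative, and the model ignores directly connected networks, so it is a simplification rather than a description of how the appliance routes traffic.

```python
import ipaddress


def same_subnet(first: str, second: str) -> bool:
    """True if two interface addresses in CIDR notation are in overlapping subnets."""
    net_a = ipaddress.ip_interface(first).network
    net_b = ipaddress.ip_interface(second).network
    return net_a.overlaps(net_b)


def pick_nic(destination: str, static_routes: dict, default_nic: str) -> str:
    """Return the NIC for `destination` given {"network/prefix": "NIC name"} routes."""
    address = ipaddress.ip_address(destination)
    best = None
    for cidr, nic in static_routes.items():
        network = ipaddress.ip_network(cidr)
        # Prefer the longest matching prefix, as routing tables do.
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, nic)
    return best[1] if best else default_nic


if __name__ == "__main__":
    routes = {"203.0.113.0/24": "NIC1"}  # Example static route assigned to NIC1.
    print(same_subnet("192.0.2.10/24", "192.0.2.20/24"))            # unsupported setup
    print(pick_nic("203.0.113.7", routes, default_nic="NIC0"))      # static route
    print(pick_nic("198.51.100.99", routes, default_nic="NIC0"))    # default gateway
```

A check like `same_subnet` is a quick way to confirm that the two interface addresses you plan to use do not fall into the unsupported same-subnet case.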

UNC paths are commonly used to back up NAS shares or partitions.

Example scenario

You have a network share on a NAS server you want to back up.

The Infrascale disaster recovery backup agent cannot be installed directly on the NAS server because the server runs an operating system that the backup agent does not support.

In this case, use a UNC path to back up the share:

  1. Log in to the Management Console of the Backup & Disaster Recovery appliance.

  2. In Clients › Summary, add a new client or use an existing one.

  3. Edit the backup file set of the client to include the UNC path to the NAS share.

    The path must start with //?/UNC/. For example, //?/UNC/server_name/share or //?/UNC/IP_address/Folder_to_start_in.

    The backup agent must be configured to run as a user who has access to this path, rather than as the default Local System account. You can configure this either in the RVXConfig tool or in Services on the client computer. This account must also have backup operator permissions in the network.

    If the share you are backing up is not a Windows file server, you will need to select the File Set option to ignore Win32 attributes.

    If the job log contains ERR=00000522: A required privilege is not held by the client for each file, you have to select the File Set option to ignore Win32 attributes.

    A less optimal alternative is to set up a mapped drive from a Windows computer, and include that mapped drive as part of the backup for that Windows client. This method will require the backup agent to be manually started by the user as a program rather than as a service.
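Share paths are usually written as \\server\share, so converting them to the //?/UNC/ form required in step 3 is an easy place for typos. The following sketch does the conversion; `to_backup_unc` is a hypothetical helper, and the server and share names in the demo are placeholders.

```python
def to_backup_unc(path: str) -> str:
    r"""Convert \\server\share\folder or //server/share/folder to //?/UNC/server/share/folder."""
    normalized = path.strip().replace("\\", "/")
    if normalized.startswith("//?/UNC/"):
        return normalized  # Already in the required form.
    if not normalized.startswith("//"):
        raise ValueError(f"not a UNC path: {path!r}")
    parts = [part for part in normalized[2:].split("/") if part]
    if len(parts) < 2 or parts[0] in ("?", "."):
        raise ValueError(f"expected \\\\server\\share[\\folder...], got {path!r}")
    return "//?/UNC/" + "/".join(parts)


if __name__ == "__main__":
    # Placeholder server and share names.
    for example in (r"\\nas01\backups", r"\\192.0.2.50\share\projects"):
        print(example, "->", to_backup_unc(example))
```

The function only rewrites the text of the path. Whether the agent can reach the share still depends on the account it runs as, as described in step 3.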

Error message

MSSQL: warning: explicitly included database 'XXX' on DEFAULT server instance appears to be offline, will try anyway 
MSSQL: error: backup of database 'XXX' on DEFAULT server instance failed unexpectedly: An exception occurred while executing a Transact-SQL statement or batch.: because Database 'XXX' cannot be opened because it is offline.

Error description

Databases and all of their dependencies must be online to be backed up through Microsoft SQL Server.

Any backup operation that implicitly or explicitly references data that is offline fails. Some typical examples include the following:

  • You request a full database backup, but one file group of the database is offline. Because all file groups are implicitly included in a full database backup, this operation fails. To back up this database, you can use a file backup and specify only the file groups that are online.

  • You request a partial backup, but a read/write file group is offline. Because all read/write file groups are required for a partial backup, the operation fails.

  • You request a file backup of specific files, but one of the files is not online. The operation fails. To back up the online files, you can omit the offline file from the file list and repeat the operation.

See further details.

Steps to resolve

Exclude offline databases from the backup using RVXConfig, or bring the selected databases and all of their dependents online before attempting backup.

Symptoms

Job begins, but does not get past VSS, nor does it seem to fail:

  • Job hangs with 0’s across the board when viewing Client › Properties for the job and virtually no message logs available; or

  • Job fails due to network connectivity (reset by peer).

Troubleshooting

Disable VSS entirely to confirm if this is a VSS issue or not.

If the backup completes with VSS disabled, attempt to run a backup with VSS enabled but Include Writers disabled. This will determine whether VSS as a whole or only the writers are causing the issues.

Also, try to run the following through the client diagnostics tab to view the writers. If it does not work (hangs up), try the same from the server itself.

vssadmin list writers

If running the above command hangs as well, try the following:

  1. Try rebooting the machine and see if that fixes the “hang” problem.

  2. If not, make sure the service is not disabled.

    1. Open Start Menu, right-click Computer, and then click Manage.

    2. Go to the Services panel (Services and Applications › Services).

    3. Find Volume Shadow Copy in the list.

    4. The Startup Type value should not be Disabled.

    5. Change Startup Type to Manual if necessary.

  3. Try restarting the service from the Properties dialog (see steps above).

    1. If service is running, click Stop to stop it.

    2. Click Start.

    3. Check that Service status goes from Starting to Started.

  4. Try running vssadmin list writers again.
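To tell a hung vssadmin from a slow one without watching the console, the command can be run with a time limit. The following is a sketch under assumptions: the 120-second limit is an arbitrary choice, `run_with_timeout` is an illustrative name, and on Windows the script has to be started from an elevated prompt because vssadmin requires administrator rights.

```python
import shutil
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run `command` and return its standard output, or None if it did not finish in time."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # The child process is killed when this expires.
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


if __name__ == "__main__":
    if shutil.which("vssadmin") is None:
        print("vssadmin was not found; run this on the Windows client")
    else:
        output = run_with_timeout(["vssadmin", "list", "writers"], timeout_seconds=120)
        print("vssadmin did not answer in time" if output is None else output)
```

A `None` result corresponds to the hang described above, in which case the reboot and service checks in the steps are the next thing to try.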

Symptom

A backup job seems to hang or get stuck either in Jobs › Active Jobs, or in Clients › Summary under the Active Jobs column. Other jobs may be waiting behind this backup job.

Troubleshooting

  1. Go to Jobs › Active Jobs, right-click the job, and then click Client Properties.

    Note what file the job seems to be hung up on.

    It is possible that the job is performing very slowly rather than completely hung up. Monitor the Client Properties window for a moment to ensure the client is completely hung up.

  2. Go to Jobs › Active Jobs, right-click the job, and then click Message Logs.

    Note if there are any errors in the Message Logs.

  3. Go to Support, and review the server.xml log to see if there are any error messages associated with this client and backup.

  4. Restart the services on either the client or the appliance, or both.

    For more information on how to properly do this, see How to restart backup service on the appliance or agent.

If the client seems to hang up repeatedly:

Again, note which file the client seems to be hanging up on, especially if it is the same file each time. (See Client Properties above.)

  1. Go to Clients › Edit.

  2. Select the affected client, and then click the “pencil” button in the Backup group.

    Check to see if VSS and Include Writers are selected.

Other things to consider

Waiting for Media

Jobs › Active Jobs, see the Status column.

If status is Waiting for Media, right-click the job, and then click Message Logs.

Review any error messages present.

There is a possibility this could be tied to database problems. If so, contact Infrascale Support.

System State backup (ntbackup) running out of space. (This only affects older versions. Infrascale Disaster Recovery 3.4 and later uses VSS for System State Backup, not ntbackup.)

Jobs › Active Jobs, right-click the job, and then click Message Logs.

If the last message is: CLIENT-FileIO: WSS: System state will be saved to C:\Program Files\RVX-Backup\sysstate\systemstate.bkf.

Clients › Summary, right-click the client, and then click Properties.

If there is no current file (it may say ‘null’):

If the Message Logs show that the client appears to be hanging in the System State backup, the most likely cause is ntbackup running out of space, or hitting some other error, and hanging. Check the C: drive for free space and the event log for other possible ntbackup errors. You can also log in to session 0 and see if ntbackup has displayed an error dialog.

If you are unable to create enough space, you can disable System State backup using Configure Backup › System State, and clearing the option there.
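The free-space check on the client can be done quickly from a script. This is a minimal sketch that uses the Python standard library; the function name and the required size are assumptions, because this article does not state how much space a System State backup needs.

```python
import shutil


def free_space_report(path, required_bytes):
    """Report free space on the volume that holds `path`.

    `required_bytes` is a hypothetical threshold that you choose yourself.
    """
    usage = shutil.disk_usage(path)
    return {"free_bytes": usage.free, "enough": usage.free >= required_bytes}


# Example: check the current volume against an assumed 1 GiB requirement.
report = free_space_report(".", 1 * 1024**3)
print(f"Free: {report['free_bytes'] / 1024**3:.1f} GiB, enough: {report['enough']}")
```

On the affected client, pass the drive that holds the System State file (for example, the C: drive root) instead of the current directory.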

Symptom

You want to restore a particular file to a server, but you cannot find it when browsing to the location it should be on the server using the “Browse and Restore” feature.

Steps to resolve

There are a few things that can cause a folder to go “missing” on the Backup & Disaster Recovery appliance.

File sorting confusion

Make sure you check the entire directory the folder is supposed to reside in. Check to see if perhaps the folder is sorted differently due to the capitalization of the folder name. Currently, the appliance will sort folders alphabetically using folders with capitalized names first, then alphabetically using folders with names in all lowercase. This can cause folders with lowercase names to be several pages later than those with capitalized names.

There have been many instances where a file appears to go missing, but it is actually being sorted differently due to capitalization.
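The effect resembles a plain case-sensitive (ordinal) sort, in which every uppercase letter precedes every lowercase letter. The sketch below only illustrates that ordering with hypothetical folder names; it is not the appliance's actual sorting code.

```python
# Hypothetical folder names, for illustration only.
names = ["archive", "Backups", "Zebra", "accounting", "Data"]

# Case-sensitive sort: capitalized names first, then lowercase names.
ordinal = sorted(names)

# Case-insensitive sort, which is what most users expect to see.
case_insensitive = sorted(names, key=str.casefold)

print(ordinal)           # ['Backups', 'Data', 'Zebra', 'accounting', 'archive']
print(case_insensitive)  # ['accounting', 'archive', 'Backups', 'Data', 'Zebra']
```

In the first ordering, the folder "accounting" appears after "Zebra", so in a long directory listing it can end up several pages later than expected.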

VSS writers

Check to make sure the directory or program does not require the use of VSS Writers. Sometimes, a directory or program will require the data to be backed up using a writer because of certain features/functionality of that directory or program and must be backed up using VSS. This requirement is completely on the program/server side and is something we cannot control.

For example, if data is backed up from a DFS replication share, it must be backed up via the DFS replication service writer.

The files backed up using writers will not be located under the drive they appear to be under on the server. Instead, they will be under the VSS folder in the backup. The writer must be restored as a whole unit because it must be restored using the VSS writer.

This is only backed up if you have enabled VSS with “Use Service with All Non-Excluded Writers (Recommended)” enabled in the file set. If this setting is not selected, the data will not be backed up.

There may be a way to make a “flat file” copy of the DFS share to another location and back that up without VSS or writers enabled. This would allow single file restores similar to typical flat file backups and may be the simplest way to get around the writer restore limitations.

One other option would be to write before/after scripts to get the flat file copy during the regularly scheduled backup.

Symptom

  • A client does not run a scheduled backup

  • A manual backup appears to run successfully

  • No scheduled backup appears in Jobs › Scheduled Jobs for the affected client

Things to check

  • Make sure the client has a schedule enabled under Clients › Edit (you can also see this information in Clients › Summary under the Schedule Column)

  • Look over the schedule in Settings › Backups › Job Schedules to confirm that the expected backup runs are configured in it

  • Check the appliance time in Settings › Basic › Appliance Status, and change it to match the servers if needed by going to Settings › Basic › Date/Time

Steps to resolve

This issue is commonly attributed to something getting hung up during the initial client configuration.

If none of the above steps resolve the issue, it is recommended to create a new client and test to see if the same behavior is observed with the new client.

You can also try applying a different schedule to the client: click Apply, and then click Activate configuration. After that, apply the required schedule again and activate it.

Issue

In the appliance Management Console › Clients › Summary, the client shows a white and red triangle indicating that the disaster recovery backup agent is running but is unable to communicate.

Issue description

This is almost always a problem with the passwords not matching. The password in the configuration file (also displayed in the RVXConfig tool) must match the password in the appliance Management Console › Clients › Edit.

Steps to resolve

Confirm that the passwords match and then restart the service.

Another way to make sure the passwords match is to download the client configuration from the appliance and apply it to the agent.

  1. In the appliance Management Console, go to Clients › Summary.

  2. Right-click the client, and then select Download Config in the context menu.

  3. Copy the downloaded configuration file bacula-fd.conf to C:\Program Files\RVX-Backup.

Possible alternative causes

A firewall or antivirus is affecting communication.

Please confirm no outside factors are blocking communication between the appliance and the client.

Rarely, the default port (9102) is not working, and it is necessary to change this on both the RVXConfig side, as well as in Clients › Edit.

Try changing the port to 9204 (for example).

If none of the above options help, there could have been a problem with the initial installation of the agent. Uninstall and install the agent again.

Details

Status label: Client is down or unreachable

This status indicates one of the following immediate issues has emerged:

  • client is turned off

  • client is not on the network

  • firewall is blocking the bacula-fd service

  • configuration file is missing

  • backup agent is not installed

Troubleshooting

To resolve the issues that raised this status, verify and fix (if needed) the following:

Ports 9102 and 9103 are open for the affected client

Check your firewall and antivirus settings. Also, check if other third-party applications use these ports

Services are up and running on the affected client

  1. Open the Run command (press Win+R on the keyboard).

  2. Enter services.msc to open Windows Services Manager.

  3. Check if the following Infrascale services are running (see the Status column).

Address not resolved

This status is typically resolved by correcting the name in the address field. The Backup & Disaster Recovery appliance is unable to resolve the name to an address, so it cannot communicate with the client. Confirm that the name is actually resolvable over the network, or replace the name with the IP address. You may also need to confirm that the correct DNS servers are configured under the Settings tab.

Agent not running

The first step here is to confirm if the backup agent is actually running or not. The Windows Services tool is a great place to check this. You could also check by using the “Configure Backup” tool accessed via the start menu.

If the agent appears to be running, restart the service/agent.

If the agent still will not start, confirm that the configuration file is in place.

Use the RVXConfig tool and run the wizard.

Can the client and appliance ping each other?

Check the event logs for any errors regarding the RVXBRagent starting or stopping.

If the client is a Windows 2008 machine and there is a message that reads “Service is marked as an interactive service. The system is configured to not allow interactive services. This service may not function properly.”, see the Microsoft knowledge base article.

Verify, in the Task Manager, that RVXBRagent and Bacula are running. You will need to check that you want to show processes from all users.

Confirm that the port in use by the agent (default 9102) is not blocked in any way.

Firewall? Antivirus? Other programs? Unusual networking?

If an exception needs to be made in the firewall, the RVXBRagent needs port 9103 and the Bacula process needs port 9102.

Running a telnet command on the Support tab can help determine if something is blocking the port that the client is listening on: telnet [client address] [port] (for example, 9102).
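If telnet is not available on the machine you are testing from, the same reachability test can be approximated with a short script. This is a sketch under assumptions: the function name and timeout are illustrative, and a successful TCP connection only shows that something is listening on the port, not that it is the backup agent.

```python
import socket


def port_is_open(host, port, timeout_seconds=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable names.
        return False


# Usage (the address is a placeholder for your client):
#   port_is_open("192.0.2.10", 9102)
#   port_is_open("192.0.2.10", 9103)
```

A False result for port 9102 or 9103 points to a firewall, antivirus, or a stopped service, as described above.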

You might simply try uninstalling and installing the backup agent again.

If the agent in question is a few revisions old, try installing a newer version.

Escalate and document any new discoveries.

Running but cannot communicate

This is almost always a problem with the passwords not matching. The password in the configuration file (also displayed in the RVXConfig tool) must match the password in Clients › Edit. Confirm that the passwords match and then restart the service.

If that does not do the trick, there could also be a firewall or antivirus in the way. Please check to see if anything is blocking traffic.

Sometimes the default port (9102) is not working and it is necessary to change this in both RVXConfig and Clients › Edit (for example, 9204).

Lastly, if none of this helps, the installation could have issues and you may need to just uninstall and install it again.

Down or unreachable

This status may have a number of causes. The client machine may not be connected to the network, or may not be powered on at all. The Windows firewall may also be blocking the bacula-fd service; adding a firewall exception is usually the recommended solution. This status can also appear if the configuration file is not in place or if the agent software has not been installed. Ensure that everything is installed as it should be, and then restart the service.

Error message

bsock.c:237 gethostbyname() for host "XXXX" failed: ERR=Authoritative answer for host not found. 

Error description

This error indicates that the address of the client (see Clients › Edit) cannot be resolved. By default, the Backup & Disaster Recovery appliance uses the client name as the address, so this field needs to be adjusted if the name does not resolve.

Steps to resolve

This error can be resolved in two ways:

  • Verify that the host name is resolvable via DNS, and make adjustments as needed.

  • Change the address to the IP address rather than the host name.
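To verify name resolution before editing the client, you can run a quick lookup from any machine that uses the same DNS servers as the appliance. The following is a minimal sketch; the function name is an assumption, and the result reflects the resolver of the machine running the script, which may differ from the appliance's own DNS settings.

```python
import socket


def resolve_host(name):
    """Return the IPv4 address for `name`, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised on lookup failure.
        return None


# An IP address is returned unchanged, without any DNS lookup.
print(resolve_host("127.0.0.1"))
```

If resolve_host returns None for the client name, either fix the DNS record or enter the IP address in the address field, as listed above.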

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following DR image backup error:

DR Image Backup Failure: Failed to create SuperAgent remote proxy 

Error description

The error refers to miscommunication between the backup agent installed on the client and the associated appliance.

It can appear after updating the backup agent, that is, when the previous backup service had not stopped before the new one started.

Steps to resolve

If the error relates to a VM client, perform the steps for the associated host.

Uninstall backup agent and install it again

  1. Stop all related Infrascale services running on the client:

    • Infrascale Backup and Restore Agent (64-bit)

    • Infrascale Installation Client

    • Infrascale Remote Management

    If a service got stuck in the Stopping state, this indicates possible issues with the operating system. Restart the client and try stopping the service again.

  2. Uninstall all related Infrascale software from the client.

  3. Make sure the following folders were deleted, or delete them manually:

    • C:\Program Files\Infrascale

    • C:\Program Files\RVX-Backup

    • C:\Program Files (x86)\Infrascale

    • C:\Program Files (x86)\RVX-Backup

  4. Download the backup agent installer from the appliance Management Console › Settings › Tools › Downloads.

  5. Run the installer and follow the on-screen instructions to install the latest version of the backup agent.

  6. Restart the client, and then check if all related Infrascale services you stopped earlier are running again on the client.
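The folder check in step 3 can be automated if you reinstall the agent on many clients. This is a small sketch, assuming the four folders are located under the standard Windows Program Files paths; the function name is illustrative, and the script only reports leftovers without deleting anything.

```python
import os

# Folders named in step 3, written as standard Windows paths (assumed form).
AGENT_DIRS = [
    r"C:\Program Files\Infrascale",
    r"C:\Program Files\RVX-Backup",
    r"C:\Program Files (x86)\Infrascale",
    r"C:\Program Files (x86)\RVX-Backup",
]


def leftover_dirs(paths):
    """Return the subset of `paths` that still exist as directories."""
    return [p for p in paths if os.path.isdir(p)]


for folder in leftover_dirs(AGENT_DIRS):
    print(f"Still present, delete manually: {folder}")
```

An empty result means the uninstaller removed all of the folders and you can continue with step 4.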

 

Change client operating system

  1. In the appliance Management Console, go to Clients › Edit.

  2. Select the client with the error.

  3. In the General group, change the operating system to any other.

  4. Click Apply to save changes, and then click Activate Configuration on the upper right.

  5. Change the operating system back again.

  6. Click Apply, and then click Activate Configuration again.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error related to Microsoft SQL Server backup:

MSSQL: warning: cannot connect to explicitly configured 'MSSQL10.XXXX' server instance: Failed to connect to server XXXX\MSSQL10.XXXX 

Steps to resolve

You see the MSSQL10 instance when you enumerate the SQL databases in the backup agent on the Microsoft SQL Server.

This is a spurious error message: the Microsoft SQL Server plug-in is searching for a database that does not exist.

In the backup agent on the Microsoft SQL Server, make sure this database is excluded.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error related to Microsoft SQL Server backup:

SNFCSQL2012-FileIO JobId 9929: MSSQL: error: can't get configuration for backup of database 'distribution' on DEFAULT server instance: GetConfiguration timed out after 300000ms 

Error description

The Microsoft SQL Server plug-in is not able to back up databases on Microsoft SQL Server 2012. This appears to be an issue with the permissions of the account the agent is running as. The job does not fail, but completes with errors, and does not actually back up the SQL databases.

The requested backup cannot be completed by the SQL Server because the log chain has been broken. This is likely caused by a conflict from another tool or service.

Steps to resolve

It seems that Microsoft SQL Server 2012 has additional security requirements, not present in earlier versions of SQL Server, that must be met before the Infrascale plug-in can run a backup. Depending on your SQL security settings, you may have to change permissions in a number of places to allow backups.

Since the Infrascale backup agent runs as the local system account by default, the instructions below reference the NT Authority\System account. If you are using a different account to run the backup agent, change these settings for that account. These instructions also assume that you have the backup agent set up according to the normal setup instructions, and also have the SMOs and the NCLI installed, but are getting the error above.

  1. In SQL Management Studio, go to Security › Logins.

  2. Under Logins, right-click the NT Authority\System account, and then select Properties.

  3. In the Properties window, click Server Roles.

  4. Make sure both the Sysadmin and Serveradmin roles are selected.

  5. Click OK to close the window, and then try a new backup.

If you are still getting the error after setting the roles:

  1. Sign in to the SQL Management Studio.

  2. Right-click the server, and then select Properties.

  3. In the Properties window, select Permissions, and then select the NT Authority\System account from the list of logins or roles.

  4. In the Permissions group, select Grant for all allowable permissions, and verify that Deny is not selected for any of them.

  5. Click OK to close the window, and then try a new backup.

At this point, if the backup still returns this error, it is not likely a permission issue. Most of the time, you can skip the individual security changes above by setting the agent to run as your SQL administrator account. To do this, open the agent configuration on the SQL server, go to the Service Control tab, stop the service, enter the username and password of the account, and then start the service again.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error if the appliance cannot connect to a backup client:

bnet.c_1011 Could not connect to storage daemon on CFA-IP:9103. ERR=Connection refused 

Steps to resolve

Either the backup services are not running on the client, or a firewall is blocking the connection from the client to the appliance on port 9103.

A quick backup of a reduced file set (with No Change on Full enabled) should help diagnose if the problem still exists after firewall changes or other network fixes.

Generally this error means the Backup & Disaster Recovery appliance could not gain access to that file, or it does not exist, and the appliance did not back it up. This would be the case for any temporary files that existed at the time the backup began, but had been cleared out by the time the backup got to that specific file.

To eliminate that message from future backups, an exclusion can be created for that file or path from the backup file set or, if the file was already in the file set, its entry may need to be changed if the file had been moved.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

backup.c:666 Network send error to SD. ERR=An existing connection was forcibly closed by the remote host. 

Error description

This can happen when the Storage Daemon shuts down due to being killed by a process, which looks for out-of-space issues. It can also be due to a network connectivity problem, in which the TCP stack on the appliance eventually closes the connection with a RESET.

Error message

FileIO JobId 1422: ESEBackup: failure getting log and patch files: c8000232 
Some log or patch files are missing.
Error reading data header from FD. ERR=Connection reset by peer
c8000230: The database missed a previous full backup before the incremental backup 
c800020e: An incremental backup cannot be performed when circular logging is enabled 

Possible causes

Circular Logging is enabled. This is specifically recommended to be disabled for Exchange 2010, but appears to cause issues when backing up other versions of Exchange as well.

Circular logging has been enabled in the past but has since been disabled.

Another system is using VSS, backing up data or truncating logs.

Transaction logs have been manually deleted.

Steps to resolve

Disable circular logging if enabled.

Make sure you have disabled whatever else may be using VSS, backing up data or truncating logs.

Perform a full Exchange backup using NTBackup. This rearranges the transaction logs so that the correct ones are backed up in your next transaction log backup.

How to disable circular logging for Exchange 2000 and Exchange 2003

Click Start, go to Programs › Microsoft Exchange, and then click System Manager.

If the Administrative Groups branch exists in the left pane, expand it, expand the appropriate administrative groups branch, expand the Servers branch, and then expand the appropriate servers branch.

If the Administrative Groups branch does not exist, expand the Servers branch in the left pane, and then expand the appropriate servers branch. To expand a branch, double-click the branch or click the plus sign (+) to the left of the branch.

Right-click the storage group you want, and then click Properties.

To disable circular logging, clear Enable circular logging, and then click OK.

Restart the information store. For this:

Click Start, go to Programs › Administrative Tools › Services.

Click Microsoft Exchange Information Store in the right pane, and then in the Action menu, click Restart. If a dialog box appears stating that extra services will be restarted, click Yes.

The information store must be restarted because when it starts, it reads configuration information from Microsoft Windows 2000 Active Directory. The Active Directory attribute associated with the circular logging setting is called MSExchESEParamCircularLog.

When circular logging is enabled, this attribute is set to 1. When circular logging is disabled, it is set to 0.

How to disable circular logging for Exchange 2007

Start the Exchange Management Console.

In the console tree, expand Server Configuration, and then click Mailbox.

In the work pane, right-click the storage group for which you want to enable or disable circular logging, and then click Properties. The Properties dialog box appears.

Select or clear Enable circular logging.

Click OK.

To make your changes to the circular logging settings effective, restart the Microsoft Exchange Information Store service, or dismount and then mount all databases in the storage group.

 

How to perform full Exchange backup using NTBackup on Exchange 2003

This rearranges the transaction logs so that the correct ones are backed up in your next transaction log backup. Make sure all scheduled backups for this server are disabled while NTBackup is running on this server. (Go to Clients › Edit, and in the Schedule drop-down list, select None.)

Click Start › Run, enter NTBACKUP and then click OK.

The backup and restore wizard appears. Click advanced mode on the welcome page.

Click the Backup tab.

On the Backup tab, expand the server, expand the server name, expand the information store, and then select the correct storage group or the entire information store.

Select file as destination, and name it so you can find and delete it later.

Click Start backup.

This process will purge old transaction logs, making sure the errors above will be solved.

If you are able to get a successful backup via NTBackup, you should be able to get a backup via the appliance. It is recommended to get a full backup immediately following the successful NTBackup.

 

How to perform full Exchange backup using NTBackup on Exchange 2007

This rearranges the transaction logs so that the correct ones are backed up in your next transaction log backup. Make sure all scheduled backups for this server are disabled while NTBackup is running on this server. (Go to Clients › Edit, and in the Schedule drop-down list, select None.)

Go to Start › Administrative Tools › Windows Server Backup.

In the Actions pane, click Backup once. The backup once wizard will appear.

On the Backup options page, select Different options, and then click Next.

On the select backup configuration page, select Full server, and then click Next. (Use custom if you want to select individual volumes.)

On the specify advanced options page, select VSS full backup, and then click Next.

On the Confirmation page, review the backup settings, and then click Backup.

On the Backup progress page, you can view the status and progress of the backup operation. Click Close when the backup operation has completed.

If you are able to get a successful backup via NTBackup, you should be able to get a backup via the appliance. It is recommended to get a full backup immediately following the successful NTBackup.

Error message

A file access error occurred on the host or guest operating system. 

Error description

This error indicates that:

  • the Backup & Disaster Recovery appliance has issues communicating with the host of the virtual machine (VM) or with the datastore; or

  • the VM has no VMware Tools installed, or VMware Tools lacks access permissions.

Steps to resolve

First of all, run full backup of the affected VM manually:

  1. In the appliance Management Console, go to Clients › VMware.

  2. Select the affected VM.

  3. On the toolbar, click Manual Backup.

  4. In the dialog box that opens, select Full, and then click OK.

If this does not help, register the affected VM again:

  1. In the appliance Management Console, go to Clients › VMware.

  2. Right-click the affected VM, and then click Unregister.

  3. Click Activate configuration.

  4. Right-click the affected VM again, and then click Register.

  5. Click Activate configuration.

  6. Run backup of the affected VM manually.

Also, check if the affected VM has the latest version of VMware Tools installed. If so, check if VMware Tools has all permissions required for backup.

Error message

Filesystem change prohibited. Will not descend into or `filepath` is a different filesystem. Will not descend from. 

Steps to resolve

This error means that the file is on a different file system than the current one.

Sometimes this means that the path ends in a symlink or a junction point that would lead the backup process away from the current path. Following it could send the backup skipping all over the file structure and cause it to miss a lot of data.

The backup agent does not follow such links away from the path it is currently backing up. Instead, it traverses the file structure systematically so as not to miss anything.

There is a setting to change this under Clients › Edit, but it is prudent not to cross file systems during a backup and to list each file system explicitly instead.
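The idea of staying on one file system while traversing can be illustrated with a short script. This is only a sketch of the general technique, not the backup agent's implementation: it compares device IDs and skips symbolic links, the function name is an assumption, and on Windows the detection of junction points depends on the Python version.

```python
import os


def walk_one_filesystem(root):
    """Yield (directory, file names) under `root` without following links
    or descending into directories that live on another file system."""
    root_device = os.stat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        kept = []
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # a link could lead away from the current path
            if os.lstat(full).st_dev != root_device:
                continue  # mount point of a different file system
            kept.append(name)
        # Pruning the list in place stops os.walk from descending further.
        dirnames[:] = kept
        yield dirpath, sorted(filenames)


for directory, files in walk_one_filesystem("."):
    print(directory, len(files))
```

Any directory that the traversal skips would have to be listed explicitly as its own starting path, which mirrors the recommendation above to list each file system in the backup set.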

Error message

MSSQL: error: can't get configuration for backup of database `XXX` on DEFAULT server instance: GetConfiguration timed out after 300000ms MSSQL: backup of database `XXX` on DEFAULT server instance failed 

Check debug-level job message log for:

MSSQL: information handler (instance name , db name XXX): BACKUP LOG cannot be performed because there is no current database backup. 
MSSQL: information handler (instance name , db name XXX): BACKUP LOG is terminating abnormally.

To set a client debug level, right-click the client on the Clients › Summary subtab, and then select Advanced › Set Debug Level. Enter a larger value (such as 100) in the Debug Level text box, and then click OK. Run the offending job again with the new debug level to get extra log messages.

Check Microsoft SQL Server error log for:

Error: 3041, Severity: 16, State: 1. BACKUP failed to complete the command BACKUP LOG Connect. Check the backup application log for detailed messages. 
BACKUP LOG WITH TRUNCATE_ONLY or WITH NO_LOG is deprecated. The simple recovery model should be used to automatically truncate the transaction log. 

If you see this message in the error log, take note of the time(s) it has occurred. The time(s) will be useful when reviewing maintenance tasks later.

To find the ERRORLOG file, look in the LOG folder of the particular SQL Server instance. This is usually something like C:\Program Files\Microsoft SQL Server\MSSQL.x\MSSQL\LOG, where MSSQL.x is the instance ID of the SQL Server instance.
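To collect the times of the relevant ERRORLOG entries, a small script can filter the log for the two messages quoted above. This is a sketch under assumptions: it expects each log line to start with a date field and a time field, the function name is illustrative, and the sample lines are hypothetical rather than taken from a real server.

```python
# Message fragments quoted in this article.
MARKERS = (
    "Error: 3041",
    "BACKUP LOG WITH TRUNCATE_ONLY or WITH NO_LOG is deprecated",
)


def find_log_chain_events(lines):
    """Return (timestamp, full line) for every line containing a marker."""
    hits = []
    for line in lines:
        if any(marker in line for marker in MARKERS):
            parts = line.split(None, 2)
            # Assumed layout: "<date> <time> <source> <message>".
            timestamp = " ".join(parts[:2]) if len(parts) >= 2 else ""
            hits.append((timestamp, line.strip()))
    return hits


# Hypothetical sample lines, for illustration only.
SAMPLE_LINES = [
    "2009-12-15 05:45:00.01 spid52      BACKUP LOG WITH TRUNCATE_ONLY or WITH NO_LOG is deprecated.",
    "2009-12-15 05:46:10.00 spid52      Starting up database 'XXX'.",
    "2009-12-15 05:50:04.12 Backup      Error: 3041, Severity: 16, State: 1.",
]

for timestamp, _ in find_log_chain_events(SAMPLE_LINES):
    print(timestamp)
```

To scan a real file, open it with the encoding the server used to write it (often UTF-16) and pass the file object to find_log_chain_events. The resulting times can then be compared with the schedules of the maintenance jobs.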

Check Windows Application Event Log on client for:

Error 12/15/2009 5:50:04 AM MSSQL$INSTANCENAME Backup 3221228513 NT AUTHORITY\SYSTEM BACKUP failed to complete the command BACKUP LOG XXX. Check the backup application log for detailed messages. 

Description

The requested backup cannot be completed by the SQL Server because the log chain has been broken. This is likely caused by a conflict from another tool or service, including maintenance tasks scheduled through the SQL Server Management Studio and the associated Maintenance Wizard. NTBackup can also cause this issue if it is enabled and doing a system state backup (we have an upcoming NTBackup plug-in scheduled for 3.4 that will negate this problem).

Steps to resolve

Remove conflicts that are causing the database log chain to become broken. Usually, the maintenance tasks scheduled for the server being backed up should be reviewed and removed (or inactivated) if determined to be at fault. Follow the steps below to review or remove maintenance tasks using SQL Server Management Studio:

  1. Open SQL Server Management Studio on the server that has the backup problems.

  2. Locate the instance containing the “problem” database(s) inside the Object Explorer pane (usually along the left side of the application window).

  3. Find the SQL Server Agent folder for the located instance.

  4. Make sure the Object Explorer Details tab is open in the main display area and click the Jobs folder under SQL Server Agent.

    Look for offending jobs such as Truncate XXX Transaction Logs. This particular Maintenance Task causes log incremental backups to fail.

    Be particularly suspicious of jobs that are scheduled to run at the same time as the messages in the SQL error logs whose time you noted from the steps above.

  5. Disable or delete offending jobs. You can disable a job by editing the job properties.

  6. Immediately run a full backup of the database(s) for which offending jobs were disabled or deleted.

    This will prevent loss of transaction data from the lack of successful log/incremental backups.

If this is NTBackup-related, the only way to stop NTBackup from doing this is to disable the responsible VSS writer. You will have to disable the MSDE Writer or the SQL Server Writer (the OS and SQL Server version determine which one is present).

Error message

VixEOUTOFMEMORY: Memory allocation failed. Out of memory. 

Error description

Failed VMware VM backup due to NBD limits

Besides the limit on the number of connections to VC/ESXi, ESXi itself is also limited by a transfer buffer shared by all NFC connections. This limitation is enforced by the host and cannot be bypassed or known in advance. The sum of all NFC connection buffers to an ESXi host cannot exceed 32 MB, and by default it is configured as 16 MB.

The primary physical appliance uses the NBD protocol to back up VM disks. NBD, in turn, employs the VMware network file copy (NFC) protocol and thus is subject to the aforementioned limitations.

The error occurs because the ESXi server cannot serve the request for lack of resources (NFC connection buffer).

At the time when N jobs failed with the VixEOUTOFMEMORY error, N+1 or more full backups were running simultaneously, all of them backing up VMs located on a single ESXi host. The appliance uses a buffer of up to 10 MB to transfer data, so parallel jobs can run into the NFC buffer limit on the ESXi host. This does not mean that parallel backup jobs always fail; it depends on many factors.
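A rough worst-case calculation shows why parallel full backups against one host can exhaust the buffer. The sketch below assumes that every job uses its full 10 MB buffer at the same moment and that nothing else consumes NFC buffer space on the host; real usage varies, so treat the numbers as an illustration only. The function name is an assumption.

```python
def max_parallel_transfers(nfc_limit_mb, per_job_buffer_mb=10):
    """Worst-case number of transfers that fit into the host's NFC buffer."""
    return nfc_limit_mb // per_job_buffer_mb


# Default host setting (16 MB) versus the maximum (32 MB).
print(max_parallel_transfers(16))  # 1
print(max_parallel_transfers(32))  # 3
```

Under these assumptions, the default 16 MB setting leaves room for a single full-size transfer per host, and the 32 MB maximum leaves room for three.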

Steps to resolve

Nothing if there are no other failed backups with VixEOUTOFMEMORY error on your appliance on consequent backups.

You can also optimize the ESXi network (NBD) performance by increasing the NFC buffer size from 16 MB to 32 MB and reducing the cache flush interval.

Do this on all of your ESXi hosts. You can query the current values by running the following commands on the ESXi host: esxcfg-advcfg -g /BufferCache/MaxCapacity and esxcfg-advcfg -g /BufferCache/FlushInterval. This does not guarantee that the VixEOUTOFMEMORY error never happens again, but it decreases its probability. It is also a good idea in general if you run many simultaneous backups.

Consider upgrading your network to 10 GbE. That should cover every network link in the chain between the appliance and the VMware host.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

The method is disabled 

Error description

This error may show when you try to migrate a VMware VM from one host to another.

This is a known issue in VMware. Sometimes after a backup runs, VMware will not release the migration lock even though it was asked to do so. This can happen if the backup fails, but will also sometimes happen without failed backups.

Steps to resolve

To resolve this issue, you have to manually remove the lock.

Error message

FileIO JobId 1422: ESEBackup: failure getting log and patch files: c8000232 
Some log or patch files are missing.
Error reading data header from FD. ERR=Connection reset by peer
c8000230: The database missed a previous full backup before the incremental backup 
c800020e: An incremental backup cannot be performed when circular logging is enabled 

Possible causes

Circular logging is enabled. Disabling it is specifically recommended for Exchange 2010, but circular logging appears to cause issues when backing up other versions of Exchange as well.

Circular logging has been enabled in the past but has since been disabled.

Another system is using VSS, backing up data or truncating logs.

Transaction logs have been manually deleted.

Steps to resolve

Disable circular logging if enabled.

Make sure you have disabled whatever else may be using VSS, backing up data or truncating logs.

Perform a full Exchange backup using NTBackup. This rearranges the transaction logs so that the correct ones are backed up in your next transaction log backup.

How to disable circular logging for Exchange 2000 and Exchange 2003

Click Start, go to Programs › Microsoft Exchange, and then click System Manager.

If the Administrative Groups branch exists in the left pane, expand it, expand the appropriate administrative groups branch, expand the Servers branch, and then expand the appropriate servers branch.

If the Administrative Groups branch does not exist, expand the Servers branch in the left pane, and then expand the appropriate servers branch. To expand a branch, double-click the branch or click the plus sign (+) to the left of the branch.

Right-click the storage group you want, and then click Properties.

To disable circular logging, clear Enable circular logging, and then click OK.

Restart the information store. To do this:

Click Start, go to Programs › Administrative Tools › Services.

Click Microsoft Exchange Information Store in the right pane, and then in the Action menu, click Restart. If a dialog box appears stating that extra services will be restarted, click Yes.

The information store must be restarted because when it starts, it reads configuration information from Microsoft Windows 2000 Active Directory. The Active Directory attribute associated with the circular logging setting is called MSExchESEParamCircularLog.

When circular logging is enabled, this attribute is set to 1. When circular logging is disabled, it is set to 0.

How to disable circular logging for Exchange 2007

Start the Exchange Management Console.

In the console tree, expand Server Configuration, and then click Mailbox.

In the work pane, right-click the storage group for which you want to enable or disable circular logging, and then click Properties. The Properties dialog box appears.

Select or clear Enable circular logging.

Click OK.

For the changes to the circular logging settings to take effect, restart the Microsoft Exchange Information Store service, or dismount and then mount all databases in the storage group.

How to perform full Exchange backup using NTBackup on Exchange 2003

This rearranges the transaction logs so that the correct ones are backed up in your next transaction log backup. Make sure all scheduled backups for this server are disabled while NTBackup is running on it. (Go to Clients › Edit, and then select None in the Schedule drop-down list.)

Click Start › Run, enter NTBACKUP and then click OK.

The Backup or Restore Wizard appears. Click Advanced Mode on the welcome page.

Click the Backup tab.

On the Backup tab, expand the server, expand the server name, expand the information store, and then select the correct storage group or the entire information store.

Select File as the destination, and name the backup file so that you can find and delete it later.

Click Start backup.

This process purges old transaction logs, which should resolve the errors above.

If the NTBackup backup succeeds, a backup via the appliance should succeed as well. It is recommended to run a full backup on the appliance immediately after the successful NTBackup run.

How to perform full Exchange backup using NTBackup on Exchange 2007

This rearranges the transaction logs so that the correct ones are backed up in your next transaction log backup. Make sure all scheduled backups for this server are disabled while this backup is running on it. (Go to Clients › Edit, and then select None in the Schedule drop-down list.)

Go to Start › Administrative Tools › Windows Server Backup.

In the Actions pane, click Backup Once. The Backup Once Wizard appears.

On the Backup options page, select Different options, and then click Next.

On the Select backup configuration page, select Full server, and then click Next. (Select Custom if you want to back up individual volumes.)

On the Specify advanced options page, select VSS full backup, and then click Next.

On the Confirmation page, review the backup settings, and then click Backup.

On the Backup progress page, you can view the status and progress of the backup operation. Click Close when the backup operation has completed.

If this backup succeeds, a backup via the appliance should succeed as well. It is recommended to run a full backup on the appliance immediately after the successful backup.

Symptoms

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following backup error:

Network error on data channel. ERR=Connection reset by peer 

Troubleshooting and resolution

  1. The client may have been down or otherwise unreachable due to any number of networking problems:

    • NICs may be incorrectly configured on the appliance.

      Check that a disabled interface is not configured with the same IP address as the enabled interface.

    • Dual NICs may be incorrectly configured on the client.

      A long-running connectivity test from the appliance may help diagnose this.

  2. The connection may drop because of an unreliable network component between the appliance and the client.

    • Bypass or replace the network component temporarily.

    • Diagnose unreliable networking. This is usually hard to pin down to a single cause.

      • Is there anything physically distinct about how the client is connected? If not, try the following to isolate the unreliable component.

        • Use an alternate switch port.

        • Use a temporary network card such as a USB Ethernet dongle.

        • Check the server event logs for general network problems around the time the backup runs.

          The agent’s source name is RVXBRagent.

  3. Configure for unreliable networks.

    • Heartbeat Interval = 300.

    • Set the Heartbeat Interval option in the client section of the bacula-dir configuration file and in the SD configuration file.

      The Bacula default timeout assumes a LAN environment. When the File Daemon (FD) and the Storage Daemon (SD) communicate over a WAN segment, the link can be too slow for the default setting, and jobs terminate. Adding the Heartbeat Interval option has resolved this for remote clients backed up over a WAN; after that, failures typically occur only when a client is offline.

    • Also check the DNS/IP configuration.

      If the environment is not using DNS, make sure the client machines have fixed IP addresses in their TCP/IP settings. These addresses must match the settings of each client in the bacula-dir configuration file.

  4. If none of the above seems to help, check to see if VSS is enabled on the client:

    1. Go to Clients › Edit › Fileset.

    2. Select the pencil button, and look for the VSS checkbox.

    3. If it is enabled, disable it and try to run the backup again.

If the backup completes with VSS disabled but fails with VSS enabled, it is likely an issue with the VSS on the Microsoft server. Contact your VSS admin or Microsoft for assistance in resolving the VSS issues.

While we are not VSS admins, we will also do what we can to assist you in troubleshooting this issue, so you can also contact Infrascale Support.

Symptoms

All backups, including full, of a VMware virtual machine (VM) fail with the following error message:

Source and destination disk size doesn't match 

Cause

The VM hard disk (VMDK) is not properly aligned: its size is not a multiple of 1 kB (that is, the disk has an odd number of sectors). As a result, two different VMware API calls report different disk sizes.

The VMware UI does not allow you to create a disk whose size is not a multiple of 1 kB, but a disk that was not created in the UI can easily have such a size. For example, if:

  • Hard disk was created from the command line

  • VM was imported from OVA

  • VM was converted from another hypervisor

  • VM was created as part of converting a physical machine (P2V)

Resolution

Resize the disk so that it has an even number of sectors. The easiest way to do this is to increase the disk size.

  1. In vSphere Client, find the respective VM.

  2. Right-click the VM, and then click Edit Settings in the context menu.

    The Edit Settings dialog opens.

  3. On the Virtual Hardware tab, check the sizes of the VM hard disks.

    Virtual hardware

  4. If the size of any disk is not a whole number (for example, 14.9071168899), increase it to the next integer (for example, 15).

    Disk size

Sometimes, this error may occur even if a VM disk size is already an integer. In this case, increase the disk size to the next integer.
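The alignment check and the rounding step described above can be sketched as follows. This is a minimal illustration, assuming 512-byte sectors and that the sizes shown in vSphere are binary gigabytes (1024^3 bytes); the helper names are hypothetical and do not come from any VMware API.

```python
SECTOR_BYTES = 512      # assumed virtual disk sector size
KIB = 1024              # two sectors
GIB = 1024 ** 3         # assumed unit of the size shown in vSphere


def is_aligned(size_bytes: int) -> bool:
    """True if the disk size is a multiple of 1 KiB (an even number of sectors)."""
    return size_bytes % KIB == 0


def next_whole_gib(size_bytes: int) -> int:
    """Smallest whole number of GiB that is not below the current size."""
    return -(-size_bytes // GIB)  # integer ceiling division


# A disk with an odd number of sectors is not aligned.
print(is_aligned(3 * SECTOR_BYTES))   # False
print(is_aligned(4 * SECTOR_BYTES))   # True

# The fractional size from the step above rounds up to 15.
print(next_whole_gib(int(14.9071168899 * GIB)))  # 15
```

Only the upward direction is computed because a virtual disk can be grown but not shrunk from the Edit Settings dialog.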

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following disaster recovery image backup error:

DR Image Backup Failure: redstone.xmlrpc.XmlRpcException: A network error occurred. 
BeforeJob: A failure has occurred, Job is terminating.
* VimSdk component failed on the agent machine. Win32 error: 0x00000064 The specified network name is no longer available.

Error description

This error usually indicates an unstable connection between the client and the appliance while the backup is being uploaded to the SMB share on the appliance. In most cases, the issue is caused by the SMB1 protocol being used on the client; SMB1 has proved to be unreliable and insecure.

Steps to resolve

To fix this, disable the SMB1 protocol.

Error message

After updating the appliance firmware to version 8.7, DR image and Hyper-V backups could start failing with the following error:

System.ComponentModel.Win32Exception (0x80004005): The specified network password is not correct 
Example message logs

07-Jun-21 22:12:59: BeforeJob: run command “/raider/etc/runBeforeJob.sh 1957 GMSERVER16:Imaged.2021-06-07_15 00000000-0000-0000-0000-000000000000 Backup Full”
07-Jun-21 22:12:59: Requested [Auto] DR engine. Checking client DR engine configuration…
07-Jun-21 22:12:59: Will continue with [Standard Standalone] DR engine.
07-Jun-21 22:12:59: Starting physical backup. Level=[F], Volumes=[*]
07-Jun-21 22:12:59: Connecting to the client [198.51.100.1]
07-Jun-21 22:12:59: Pre-processing…
07-Jun-21 22:12:59: Awaiting client to become available…
07-Jun-21 22:12:59: Start backup task.
07-Jun-21 22:12:59: State : [Running]
07-Jun-21 22:13:04: Unsuccessful connection attempt. Result=[86] – The specified network password is not correct
07-Jun-21 22:13:04: System.ComponentModel.Win32Exception (0x80004005): The specified network password is not correct
07-Jun-21 22:13:04: [ApplicationException] – Execution failure: bad exit code [-1] 0xFFFFFFFF; Please refer to system logs for more details; See more details in client logs.
07-Jun-21 22:13:04: State : [Terminated]
07-Jun-21 22:13:04: Job Failure.com.infrascale.paragon.clientservice.exceptions.PrmActivityException: Task status: [Fail], state: [Terminated]
• Unsuccessful connection attempt. Result=[86] – The specified network password is not correct
• System.ComponentModel.Win32Exception (0x80004005): The specified network password is not correct
• [ApplicationException] – Execution failure: bad exit code [-1] 0xFFFFFFFF; Please refer to system logs for more details; See more details in client logs.

07-Jun-21 22:13:04: Job started, firmware version: 8.7.0.102, client id: 00000000-0000-0000-0000-000000000000, client ip: 198.51.100.1, agent id: Windows Server 2016,MVS,NT 10.0.14393 (64-bit)
07-Jun-21 22:13:05: BeforeJob: A failure has occured, Job is terminating.
07-Jun-21 22:13:05: Runscript: BeforeJob returned non-zero status=1. ERR=Child exited with code 1

 

Error description

DR image and Hyper-V backups use the SMB protocol to transfer backup data from clients to the appliance network share, so the first step of such a backup is to establish an SMB connection to the appliance. Establishing this connection requires an authentication protocol.

The root cause of the failure is that firmware version 8.7 tightened authentication on the appliance, which by default now permits only NTLMv2. Servers that use only NTLMv1 for SMB connections (NTLMv1 has long been discouraged because of its security risks) cannot establish an SMB connection to the appliance.

Steps to resolve

Permit the use of the NTLMv2 authentication protocol on the server in one of the following ways:

  1. Using the UI:

    For the Local Security Policy (secpol.msc) tool, go to Security Settings › Local Policies › Security Options › Network security: LAN Manager authentication level.

    Set it to Send NTLMv2 authentication only, or DC refuses LM authentication, or DC refuses LM and NTLM authentication (accepts only NTLMv2). Only these values allow for authentication on the appliance network share.

  2. By modifying the registry:

    NTLM security is controlled via the registry key HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Lsa.

    The authentication protocol variants used and accepted are selected through the following value of that key:

    Value | Type | Valid number range
    LMCompatibilityLevel | REG_DWORD | From 0 to 5 (default is 0)

    This parameter specifies the type of authentication to be used:

    • Level 0 — Send LM response and NTLM response; never use NTLMv2 session security

    • Level 1 — Use NTLMv2 session security if negotiated

    • Level 2 — Send NTLM authentication only

    • Level 3 — Send NTLMv2 authentication only

    • Level 4 — DC refuses LM authentication

    • Level 5 — DC refuses LM and NTLM authentication (accepts only NTLMv2)

    Only levels 3, 4, and 5 will allow establishing the SMB connection to the appliance.
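The rule above can be expressed as a small check. The sketch below is illustrative only: the function names are hypothetical, the level descriptions are copied from the list above, and the registry reader relies on the standard Python winreg module, so it works on Windows only and is not called here.

```python
# Level descriptions as listed above.
LM_LEVELS = {
    0: "Send LM response and NTLM response; never use NTLMv2 session security",
    1: "Use NTLMv2 session security if negotiated",
    2: "Send NTLM authentication only",
    3: "Send NTLMv2 authentication only",
    4: "DC refuses LM authentication",
    5: "DC refuses LM and NTLM authentication (accepts only NTLMv2)",
}


def can_reach_appliance_share(level: int) -> bool:
    """True if the given LMCompatibilityLevel sends NTLMv2 (levels 3 to 5)."""
    if level not in LM_LEVELS:
        raise ValueError(f"LMCompatibilityLevel must be 0..5, got {level}")
    return level >= 3


def read_level_from_registry() -> int:
    """Windows only: read the value; raises FileNotFoundError if it is not set."""
    import winreg  # standard library, available on Windows only

    key_path = r"SYSTEM\CurrentControlSet\Control\Lsa"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "LmCompatibilityLevel")
    return int(value)


for level, description in LM_LEVELS.items():
    verdict = "OK" if can_reach_appliance_share(level) else "blocked"
    print(f"Level {level}: {verdict} ({description})")
```

On a Windows server, can_reach_appliance_share(read_level_from_registry()) gives a quick indication of whether the current setting is compatible. If the value is absent from the registry, Windows applies its built-in default, which this sketch does not try to guess.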

Extra information

Setting | Description | Registry security level
Send LM & NTLM responses | Client devices use LM and NTLM authentication, and they never use NTLMv2 session security. Domain controllers accept LM, NTLM, and NTLMv2 authentication. | 0
Send LM & NTLM - use NTLMv2 session security if negotiated | Client devices use LM and NTLM authentication, and they use NTLMv2 session security if the server supports it. Domain controllers accept LM, NTLM, and NTLMv2 authentication. | 1
Send NTLM response only | Client devices use NTLMv1 authentication, and they use NTLMv2 session security if the server supports it. Domain controllers accept LM, NTLM, and NTLMv2 authentication. | 2
Send NTLMv2 response only | Client devices use NTLMv2 authentication, and they use NTLMv2 session security if the server supports it. Domain controllers accept LM, NTLM, and NTLMv2 authentication. | 3
Send NTLMv2 response only. Refuse LM | Client devices use NTLMv2 authentication, and they use NTLMv2 session security if the server supports it. Domain controllers refuse to accept LM authentication, and they will accept only NTLM and NTLMv2 authentication. | 4
Send NTLMv2 response only. Refuse LM & NTLM | Client devices use NTLMv2 authentication, and they use NTLMv2 session security if the server supports it. Domain controllers refuse to accept LM and NTLM authentication, and they will accept only NTLMv2 authentication. | 5

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

bsock.c:184 Unable to connect to Storage daemon on 198.51.100.15:9103. ERR=Connection refused 

You may also see the following warnings in the Notices widget on the Dashboard tab:

insufficient storage space on /raid 
storage space critically low /raid
storage space critically low /root full
storage space critically low /var/log full

If you see storage space critically low /raid/:catalogs: or primary database is inaccessible, this is a different error relating to the database space. Contact Infrascale Support.

Description

This error message shows when the RAID is full and no further jobs can be stored. The appliance does not automatically overwrite older jobs, so a new job fails if the appliance does not have enough space to store it.

Steps to resolve

To resolve this issue, clear space on the appliance disks.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

VMwareBackupJobRunner: error encountered during VM backup: com.rvx.vmware.exceptions.VmwareBackupException: com.rvx.vmware.exceptions.RunnerStatusFailedException: com.rvx.nativ.vmware.vixexceptions.VixEDISKOUTOFRANGE: You have requested access to an area of the virtual disk that is out of bounds 

Error description

This error is generated when Changed Block Tracking (CBT) attempts to gather data from a VMDK, and VMware believes the data is outside the range of the VMDK. The error typically shows up after the VMDK size has been increased.

Steps to resolve

Reset CBT on the VM to allow backups to run correctly.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

VMwareBackupJobRunner: error encountered during VM backup: com.rvx.vmware.exceptions.VmwareBackupException: com.rvx.vmware.exceptions.RunnerStatusFailedException: com.rvx.nativ.vmware.vixexceptions.VixEHOSTNETWORKCONNREFUSED: NBD_ERR_NETWORK_CONNECT 

Error description

A VMware VM backup job fails immediately at waiting for disk connection or is pending for up to 10 minutes at waiting for disk connection.

The command to create the snapshot is sent to the host over port 443, and some information is gathered at the same time. The appliance then connects to the host over port 902 to open the VMDK and copy its contents.

If the connection from the appliance to the host on port 902 is blocked, the appliance cannot access the contents of the VMDK it is trying to back up. If the connection is rejected, the job may fail within a few seconds. If the connection is dropped, the job has to time out (which seems to take about 10 minutes).

Steps to resolve

Verify access to port 902. To do this:

  1. In the appliance Management Console, go to the Support tab.

  2. In the Run box, enter telnet host 902 where host is the name of the VMware host, and then click Execute.

    root@qa-007:ssh:~
    $ telnet qaesx1 902
    Trying 172.16.5.5...
    Connected to qaesx1.infrascale.co
    Escape character is '^]'.
    220 VMware Authentication Daemon Version 1.10: SSL Required, ServerDaemonProtocol:SOAP, MKSDisplayProtocol:VNC, VMXARGS supported, NFCSSL supported
    root@qa-007:ssh:~ 
    $ telnet qaesx2 902
    Trying 172.16.5.6...
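In the transcript above, the first host answers on port 902, while the second attempt stops at "Trying", which is what a dropped connection looks like. The same check can be scripted from any machine that has the same network path to the host. The sketch below is a generic TCP probe built on the Python standard library, not an appliance tool; the function name and the 5-second timeout are arbitrary choices.

```python
import socket


def check_port(host: str, port: int = 902, timeout: float = 5.0) -> str:
    """Classify a TCP connection attempt as open, refused, timeout, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # the host accepted the connection
    except ConnectionRefusedError:
        return "refused"         # rejected: a backup job would fail quickly
    except socket.timeout:
        return "timeout"         # dropped: a backup job would wait, then time out
    except OSError as exc:
        return f"error: {exc}"   # for example, the host name did not resolve


# Example usage (replace qaesx1 with the name of your VMware host):
#   print(check_port("qaesx1"))
```

A "refused" result corresponds to the case where the job fails within seconds, and "timeout" to the case where the job stays at waiting for disk connection until it times out.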

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

VSS Snapshot Creation failed. Check Log file for more details. 

Error description

The VSS failure is caused by insufficient storage space available to create the VSS snapshot.

The actual error message in the trace file is the following:

VSS_E_INSUFFICIENT_STORAGE 

Steps to resolve

Run systeminfo from the Support tab to see the resources available on the server.

If this is a VM, you may be able to allocate more system resources to that server to resolve the error.

See Microsoft knowledge base article for details.

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

DR Image Backup Engine Error: 
- VSS Writer NTDS is in FailedAtPostSnapshot state, Writer Failure Code=[WriterErrorNonRetryable]

Error description

This error appears when another backup or DR solution is used along with Infrascale software on the same computer.

Some vendors install their custom VSS providers. Windows then uses these providers by default, and also forces the standard DR engine to use them. This results in backup failures, and may cause data consistency issues even in successful backups.

Steps to resolve

Check if Windows uses a third-party VSS provider by default:

  1. Open Command Prompt as administrator.

  2. In the Command Prompt window, run the following command:

    vssadmin list providers 

    This will list the VSS providers available to and used in Windows. For example:

    List of VSS providers

  3. Check for a third-party VSS provider on the list.

    In the example screenshot, it is StorageCraft Volume Snapshot Software Provider. Windows forces the standard DR engine to use this VSS provider by default, and thus the backup jobs fail.

  4. To resolve the issue:

    • Uninstall any other backup software from the affected computer, restart the computer, and then check if the problematic VSS provider was removed successfully.

    • Another option, while not recommended, is to use the legacy DR engine instead of the Standard one.

      Using Infrascale software along with any other backup solutions may cause unexpected backup failures or data consistency issues even in successful backups.
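The check in steps 2 and 3 can also be scripted by scanning the text printed by vssadmin list providers. The sketch below is a heuristic, not an official method: it assumes that built-in providers have names starting with "Microsoft" and treats everything else as third-party. The sample output is abbreviated and illustrative.

```python
import re
from typing import List


def third_party_providers(vssadmin_output: str) -> List[str]:
    """Return provider names that do not start with 'Microsoft ' (heuristic)."""
    names = re.findall(r"Provider name:\s*'([^']+)'", vssadmin_output)
    return [name for name in names if not name.startswith("Microsoft ")]


# Abbreviated, illustrative output of: vssadmin list providers
SAMPLE_OUTPUT = """\
Provider name: 'Microsoft Software Shadow Copy provider 1.0'
   Provider type: System

Provider name: 'StorageCraft Volume Snapshot Software Provider'
   Provider type: Software
"""

print(third_party_providers(SAMPLE_OUTPUT))
# ['StorageCraft Volume Snapshot Software Provider']
```

An empty list suggests that only built-in providers are registered. Any name the function returns is a candidate for the uninstall step above.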

Error message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following error:

Restore Failure: Error(s) occurred while running the necessary scripts to complete restore. 

Error description

This error may show when restoring files originally backed up from a VMware VM.

The message indicates that the files were restored, but the script that runs afterward to set the file permissions either encountered an error or failed to run.

Steps to resolve

You can try to run the script again manually, or set the permissions on the restored files manually.

Issue

After restoring a virtual machine (VM) client, you are unable to boot due to the blue screen errors on the VM with a Windows guest operating system.

Issue description

Some older versions of Windows may have been backed up on an older VM hardware version. During the restore, the host defaults to the latest hardware version that its ESX version supports:

  • ESX 5.5 = 10

  • ESX 5.1 = 9

  • ESX 5.0 = 8

  • ESX 4.0 = 7

  • ESX 3.5 = 4

There seem to be some slight changes in how VMware handles boot between hardware versions 4 and 7, which have been known to cause blue screen errors on boot.

Steps to resolve

Since there is no way to tell vCenter or a host to use a specific hardware version during the restore, use the following workaround, which has worked during testing.

After restoring the VM client, delete the client created by the restore, but make sure to leave the VMDKs on the datastore. Then create a new client in VMware, specify the hardware version to use, and connect it to the VMDK created by the restore. This should allow the client to boot without blue screen errors.

Warning message

Duplicate VMware connections have been eliminated. Review your connections. 

Warning in the Management Console

Example warning message in the Management Console

Warning in the Dashboard

Example warning message in the Dashboard

Warning description

The VMware host sends a command to the Backup & Disaster Recovery appliance, and the appliance interprets the command as a duplicate connection from the same VMware host.

To prevent complications with the backup sets, their schedules, and data consistency, the appliance automatically removes all artifacts related to the connection and shows the respective message.

Steps to resolve

  1. In the appliance Management Console, go to Clients › VMware.

  2. Check for the duplicate connections from the same VMware host.

  3. Check for the duplicate virtual machines and the respective clients.

  4. On the Dashboard tab, click the warning in the Notices widget.

    The system clears the warning from the appliance immediately, and from the Infrascale Dashboard within five minutes.

Warning message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following warning:

Restore Warning(s): The scripts required to complete the restore did not successfully run. 

Warning description

This warning may show because of errors that occurred when restoring files originally backed up from a VMware VM.

The message indicates that the files were restored, but the script that runs afterward to set the file permissions either encountered an error or failed to run.

Steps to resolve

You can try to run the script again manually, or set the permissions on the restored files manually.

Warning message

Message logs in the Management Console of the Backup & Disaster Recovery appliance show the following warning:

The appliance is unable to connect and send monitoring data to the Dashboard. 
Click here for more information on possible causes and solutions.

Warning description

The appliance shows this warning if:

  • someone changed the account credentials (username or password) used to register the appliance in the Infrascale Dashboard; or

  • the account under which the appliance was registered in the Infrascale Dashboard was deactivated (canceled or suspended); or

  • the trial period for the account under which the appliance was registered in the Infrascale Dashboard ended; or

  • the account under which the appliance was registered in the Infrascale Dashboard was locked for security reasons; or

  • the firewall blocks the ports that the appliance uses to communicate with the Infrascale Dashboard.
