The Archiving capability in our Enterprise Server Edition backup client was developed to reduce disk usage on a backed-up computer by removing redundant, obsolete and trivial (ROT) data from your primary storage. This can assist enterprises in dealing with rapidly growing volumes of unstructured data without resorting to purchasing more primary storage.
All archived data remains immediately accessible on-demand on the originating computer.
- Saves disk space by removing old data
- Saves disk usage costs by essentially tiering data from local high-cost storage to a lower-cost storage platform
- Archived data is always available on demand
- Data redundancy is ensured before archiving occurs as multiple copies are held in separate locations
- Safety features are in place to ensure normal operation of a computer with archived data
- Easy to use with the ESE backup client
To activate and configure Archiving immediately, see Article 1175.
Stubbing refers to the process in which the original files are replaced with sparse files that take up much less space. The originals are stored on Redstor’s Storage Platform while still being accessible from the user’s computer immediately via the sparse files. This offers significant benefits over on-disk storage.
The user determines which files get stubbed – typically, this is based on a file’s last accessed time. Since last accessed times are not automatically kept up to date by the computer’s operating system, the Archiving service itself monitors access to files and ensures that the last access times stay updated.
Stubs are the files left on the disk after stubbing has taken place.
Rehydration is the process of downloading the contents of a previously stubbed file from the Storage Platform. Files can be selected for rehydration through the ESE backup client or rehydrated on demand as they are accessed.
The rehydration service is a Windows service that is installed when archiving is enabled in the ESE backup client. It manages the relationship between the operating system, file system and the backup client, and ensures that stubs are rehydrated when accessed.
Note: When stubs are accessed, only the relevant data is rehydrated. This results in a stub being partially rehydrated. Also, files removed from the backup selection are automatically rehydrated at the next cycle.
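To illustrate what partial rehydration means in practice, the sketch below tracks which byte ranges of a stub have been fetched so far. The function names and the range-list representation are illustrative assumptions, not the rehydration service's actual bookkeeping.

```python
def add_range(ranges, start, end):
    """Merge the half-open byte range [start, end) into a list of ranges.

    Returns a new sorted list of non-overlapping (start, end) tuples.
    Hypothetical helper for illustration only.
    """
    merged = []
    for s, e in sorted(ranges + [(start, end)]):
        if merged and s <= merged[-1][1]:
            # Overlapping or adjacent: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def rehydrated_bytes(ranges):
    """Total number of bytes covered by the rehydrated ranges."""
    return sum(e - s for s, e in ranges)
```

A stub with ranges `[(0, 100), (300, 400)]` has 200 bytes rehydrated; the rest of the file would still need to be fetched from the Storage Platform.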
Stages of the Archiving process
- Compatibility and version checks performed.
- Account is checked for an active Archiving licence.
- Client confirms that the data has been backed up.
- Client confirms that the mirroring of data has completed so that two copies of the data exist in the lower tier storage.
- The rehydration service is installed and checked. (See Key concepts above.) If it already exists, it is updated to the appropriate version and its status verified before archiving can proceed.
- If the above checks are successful, all eligible files are stubbed, based on the archival selection.
- Files that no longer fulfil the archival selection criteria are rehydrated.
- Archiving is triggered after backups to ensure that archived data is consistent.
- Files that are required by critical applications and are frequently modified, such as SQL Server and Microsoft Exchange database files, will not be stubbed.
- Logs are generated during the process and all actions taken can be reviewed in the Logs tab.
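The checks that precede stubbing can be read as a sequence of gates, each of which must pass. A minimal Python sketch of that idea follows; `PreChecks` and `first_failed_check` are hypothetical names used for illustration, not part of the backup client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreChecks:
    # Hypothetical flags mirroring the checks listed above.
    version_compatible: bool
    licence_active: bool
    backup_complete: bool
    mirror_complete: bool
    rehydration_service_ok: bool


def first_failed_check(c: PreChecks) -> Optional[str]:
    """Return the name of the first failed check, or None if stubbing may proceed."""
    ordered = [
        ("version_compatible", c.version_compatible),
        ("licence_active", c.licence_active),
        ("backup_complete", c.backup_complete),
        ("mirror_complete", c.mirror_complete),
        ("rehydration_service_ok", c.rehydration_service_ok),
    ]
    for name, ok in ordered:
        if not ok:
            return name
    return None
```

Returning the name of the failed gate, rather than a bare boolean, matches the idea that every action taken should be reviewable in the logs.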
When will a file be stubbed?
It will be stubbed if it:
- is in the archive selection
- is in the most recent backup on the Storage Server (SS)
- is in the most recent backup on the Mirror Server (MS)
- is not in the system state selection (or selected by any VSS writer)
- has not been modified in the last x days (the "not accessed in x days" setting)
- has not been accessed in the last x days (the same setting)
- is greater than or equal to 1 KB
- is less than or equal to 64 GB
- is not open
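The conditions above must all hold at once. As a rough sketch, they can be written as a single predicate. The `FileState` fields are hypothetical names, 1 KB is assumed to mean 1024 bytes, and the use of `>=` for the day threshold is an assumption.

```python
from dataclasses import dataclass

KB = 1024            # assumption: binary units
GB = 1024 ** 3
MIN_STUB_SIZE = 1 * KB
MAX_STUB_SIZE = 64 * GB


@dataclass
class FileState:
    # Hypothetical view of one file, as seen at archive time.
    in_archive_selection: bool
    in_latest_backup_ss: bool
    in_latest_backup_ms: bool
    in_system_state: bool
    days_since_modified: float
    days_since_accessed: float
    size_bytes: int
    is_open: bool


def should_stub(f: FileState, threshold_days: int) -> bool:
    """True only if every stubbing condition in the list is met."""
    return (
        f.in_archive_selection
        and f.in_latest_backup_ss
        and f.in_latest_backup_ms
        and not f.in_system_state
        and f.days_since_modified >= threshold_days
        and f.days_since_accessed >= threshold_days
        and MIN_STUB_SIZE <= f.size_bytes <= MAX_STUB_SIZE
        and not f.is_open
    )
```

Because the conditions are joined with `and`, a single failing condition (for example, a file missing from the latest backup on the MS) is enough to prevent stubbing.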
When will a file be rehydrated?
It will be rehydrated if it:
- has been modified within the last x days (the "not accessed in x days" setting)
- is not in the archive selection
- is not in the most recent backup on the SS
- is not in the most recent backup on the MS
- is not a stub from this account
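In contrast to stubbing, any single condition in this list is enough to trigger rehydration. A small sketch that collects the applicable reasons is shown below; `StubState` and `rehydration_reasons` are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StubState:
    # Hypothetical view of one stub at archive time.
    recently_modified: bool
    in_archive_selection: bool
    in_latest_backup_ss: bool
    in_latest_backup_ms: bool
    stub_from_this_account: bool


def rehydration_reasons(s: StubState) -> List[str]:
    """Return every reason the stub should be rehydrated (empty list: stay stubbed)."""
    reasons = []
    if s.recently_modified:
        reasons.append("recently modified")
    if not s.in_archive_selection:
        reasons.append("not in archive selection")
    if not s.in_latest_backup_ss:
        reasons.append("not in most recent backup on SS")
    if not s.in_latest_backup_ms:
        reasons.append("not in most recent backup on MS")
    if not s.stub_from_this_account:
        reasons.append("not a stub from this account")
    return reasons
```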
What is the relevance of last accessed time to archiving?
Archiving uses the last accessed time of a file to determine if it should be archived or not. However, in most installations the last accessed times are not kept up to date by Windows. Archiving therefore monitors access to files and when a file is accessed it will update the last access time itself. This means that last accessed times for files will be updated immediately when a file is accessed for whatever reason (for instance right-click > Properties on an image file to get the dimensions).
Last accessed times are only updated once archiving is enabled. This means that for the first archiving run it is possible that files that have been recently accessed are actually still archived because they have not been accessed AFTER archiving was enabled. In such a case we work on whatever is the most recent - Last Accessed or Last Modified. Of course, these files will get rehydrated the moment they are accessed. The last accessed time will then be updated to determine whether it is archived next time.
This time difference just means that more files may initially get archived than intended or expected. There is a setting which introduces a 'lag time', effectively freezing the stubbing process for a specified number of days in order to minimise this behaviour. The default is 0 and it can only be set in the properties file.
To get the driver installed and update Last Accessed times without running an archive, use the Calculate Savings feature. This can be cancelled after the driver has been installed.
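The age comparison described above ("whatever is the most recent") and the lag time can be sketched as follows. The function names are hypothetical. Counting the lag from the moment archiving was enabled, and comparing with `>=`, are assumptions made for illustration; the actual properties-file setting is not shown here.

```python
from datetime import datetime, timedelta


def reference_time(last_accessed: datetime, last_modified: datetime) -> datetime:
    """Use whichever timestamp is the most recent."""
    return max(last_accessed, last_modified)


def old_enough(last_accessed, last_modified, now, threshold_days):
    """True if the file has been neither accessed nor modified within the threshold."""
    age = now - reference_time(last_accessed, last_modified)
    return age >= timedelta(days=threshold_days)


def stubbing_frozen(archiving_enabled_at, now, lag_days=0):
    """True while the lag period is still running (assumed to start when archiving is enabled)."""
    return now < archiving_enabled_at + timedelta(days=lag_days)
```

With the default lag of 0, `stubbing_frozen` is never true once archiving has been enabled, so the first run may stub files whose access history predates the driver.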
What happens when anti-virus software scans or changes the files?
The filter driver sits below the anti-virus drivers, so we intercept all the calls, including those from anti-virus. Anti-virus software opens the files in such a way that we can see it’s only been read for scanning purposes, in which case only the already rehydrated portions are returned. This means our archiving software can co-exist with anti-virus software, and the anti-virus won’t trigger the rehydration of the files.
In some anti-virus software, you may need to adjust the security settings for Archiving to work properly. In ESET, for example, you need to select the option Preserve last access timestamp to keep the original access time of scanned files, instead of updating them. If this option is not enabled, files will not be stubbed.
What is the minimum file size criterion (if any) for stubbing and archiving a file, assuming that very small files (< 1 MB) don't need to be archived?
Files smaller than 1 KB will not be stubbed.
What is the exact size of the stub file on disk?
Usually 4 KB, depending on how the volume is formatted.
Is archiving limited by the length of the file path?
The path can be anything since the filter driver only uses file IDs. So the filename can have obscure characters or be any length. It all looks the same to the filter driver.
Restore / rehydrate FAQs
If files are encrypted due to ransomware, can these be restored as normal?
We need to distinguish between whether ransomware encrypted the files before or after the archiving process.
- If ransomware kicks in AFTER archiving, it will rehydrate a file as it reads it and will then write encrypted data back. At that point, archiving is no longer relevant as those files are now local. To recover, you will do a normal restore as you would for a backup.
- If the files were encrypted BEFORE archiving but after backup, there would be no rehydration activity, so files could still be restored as normal.
In a disaster recovery scenario, would the client restore stubs where disks/folders containing stubs were lost or would data be restored in a rehydrated state?
At present it will restore the full file. We do create stub files first when you restore with InstantData Permanent, but those stubs are rehydrated as part of the restore. The restoring of stubs is on our development roadmap.
Once files are rehydrated, will they re-stub again after the archive trigger?
Yes. If a stubbed file gets rehydrated and is not accessed again for the specified number of days, it will then be stubbed again.
This is why it is best practice to have a retention period of at least a few weeks; otherwise, files that are regularly accessed could be repeatedly stubbed and rehydrated.
How long can data be archived for?
As long as the backups are kept (based on the retention settings for the collection/group).
If a stub is deleted will the file be deleted as well?
Yes. A stub and a normal file are treated in the same way. As soon as you delete a stub/file it will be removed from the backups and eventually that file will get flushed out with the roll-ups.
If I restore a deleted archived file, what will be restored - the archived file or the original file?
For now we only restore full files; we do not restore stubs on their own.
What happens if there is insufficient disk space to rehydrate the files when they get recalled from the backup selection?
This is a likely scenario - we call it over-subscription - where you have more data archived than your local disk can hold. In such a case we hydrate as far as we can.
What will happen on roll-ups?
Normally stubbed files will remain in the selection – this should ensure they are not flushed out with a roll-up. If not, they will automatically get rehydrated in which case they can be rolled up. As a fail-safe we have a special list that we maintain on the Platform, which includes all the archived files, and whenever we do roll-ups we will not delete those files.
Use case FAQs
Does this work with file servers? Will machines that are accessing remotely trigger the rehydration process?
Yes, it works with file servers. In fact, that will probably be the most common use case. File servers are a brilliant candidate for archiving, especially since many of those files are not being used. So if you run the Redstor Pro software on the file server itself, even UNC access will trigger the rehydration process.
Can I archive files on a UNC share on a separate server?
No, the filter driver needs to be installed on the host that is serving the files, not the client accessing them.
What happens if stub files are moved outside of the backup selection to somewhere else on disk?
They will be detected as moved and will get rehydrated on the next archive run – that is why the full system is scanned.
What happens if stub files are moved outside of the backup selection to a UNC path?
The filter driver will detect that the file is moving from local storage and will rehydrate the file to the target location. The source stub will then be removed.
How does archiving work with local copy backups?
Local copy backups are not supported with Archiving. Although it is possible to enable both features, you may currently end up with archived files missing in the local copy due to stubbing.
If using both is an absolute must, the lower-risk configuration is to enable Archiving only after the local copy functionality has been running for a few backups. Even in this configuration there is still a risk. It is also not possible to rehydrate from the local copy, as Archive data is not stored in local copy backups.
Are there any limitations around connectivity?
Rehydration is not possible when both the SS and the MS are inaccessible. This is important to keep in mind for laptop users who may not always be connected to the internet.
Can we override the date and stub files immediately when their backup has been mirrored?
Only if the last accessed and modified dates meet the archival criteria. Third-party applications such as Total Commander can set these.
If an application is not working properly for a specific file after stubbing, can we exclude the file from archiving to prevent it from being stubbed again and immediately rehydrate it?
Currently you can only exclude folders, but if a file is regularly accessed it will not get stubbed in the first place since it will have been accessed recently.
What is the troubleshooting methodology for why files have not been stubbed?
Check that both the Last Accessed and Last Modified dates are older than the threshold. Ensure the file is backed up and mirrored (AFTER MAKING ANY DATE CHANGES – the timestamps on disk must match the latest backup).
If the file is still not archiving, enable debug mode and locate the file in the service log after doing an archive. The reason for stubbing (or not) will be in brackets.
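As a first troubleshooting step, the two timestamps can be compared against the threshold with a short script. The sketch below uses Python's standard `os.stat`; the `timestamp_report` helper and its return keys are illustrative, and it only covers the date check, not the backup or mirror status.

```python
import os
import time

SECONDS_PER_DAY = 86400


def days_since(timestamp, now=None):
    """Age of a POSIX timestamp in days."""
    now = time.time() if now is None else now
    return (now - timestamp) / SECONDS_PER_DAY


def timestamp_report(path, threshold_days, now=None):
    """Report whether both Last Accessed and Last Modified are older than the threshold."""
    st = os.stat(path)
    accessed = days_since(st.st_atime, now)
    modified = days_since(st.st_mtime, now)
    return {
        "days_since_accessed": accessed,
        "days_since_modified": modified,
        # Both dates must be old enough, so test the more recent one.
        "older_than_threshold": min(accessed, modified) >= threshold_days,
    }
```

If `older_than_threshold` is false, the file is not yet eligible by date, and there is no need to look further until the dates meet the archival criteria.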
Are Read-Only files stubbable?
Does rehydration have a resume support in a power-failure scenario?
Yes. It will resume where it has left off.
Will the rehydration run from the mirror if the storage server is down?
Yes, and this happens completely transparently.
If a storage server is lost, will my data be automatically hydrated?
No, but you will not be able to archive any additional data until storage redundancy is restored.
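The availability rules in the answers above can be summarised in a small sketch: rehydration needs at least one of the two servers, while new archiving needs both. The function names are hypothetical, and preferring the SS over the MS when both are reachable is an assumption.

```python
from typing import Optional


def pick_rehydration_source(ss_reachable: bool, ms_reachable: bool) -> Optional[str]:
    """Choose where to rehydrate from, or None if rehydration is not possible."""
    if ss_reachable:
        return "SS"
    if ms_reachable:
        return "MS"  # transparent fallback to the mirror
    return None


def can_archive_new_data(ss_reachable: bool, ms_reachable: bool) -> bool:
    """New data is only archived while storage redundancy is intact."""
    return ss_reachable and ms_reachable
```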
Why don’t partially downloaded files get re-archived at the next backup?
Because their last access time is current after a read operation. When the last accessed time meets archival criteria, they will be re-archived.
Can I see how much data is partially rehydrated?
Yes, you can do this by looking at the file properties.
Is the filter driver only active during the archive task, and is it doing anything when archiving is not happening?
The filter driver is always running, so it is always notified when reads and writes come through. We went to a lot of effort, though, to ensure that there is almost zero overhead when you are not reading or writing stub files. This is tested automatically when the driver is submitted to Microsoft for verification, to ensure that the driver will not slow down the machine.
Can system files be set for exclusion by default to prevent system crashes?
You can select any folder for exclusion that you want. For now your Windows folder is excluded by default and we automatically exclude all files seen by VSS as system state files. However, that is no guarantee so we recommend that you exclude any folder accessed during boot.
How can the software handle existing archive solutions and their stub files?
We have a setting whereby you can exclude third-party stubs from the backup (and therefore archiving).
Is the stubbing and rehydrating of EFS encrypted files possible?
No, these are not supported for stubbing and will be skipped.
Reporting and analysis FAQs
Can I extract a report from the Redstor client that will provide detail on the status of the archiving and any other related information?
Yes, you will find this information in the archive logs, in the same place where you view the backup logs.
This displays which files have been stubbed, which have been rehydrated and also shows results in a summary.
Archiving information is also available in the Storage Platform Console. Here you can see Archived Data, Archived Files, and Last Archive date.
Note: If these columns are not currently visible, you can enable them by going to View > Customise Columns.
You can also schedule (or generate on demand) an Archiving report for your whole database, or for a specific group or account, by opening the Console in the Reports view.
Is it possible to see in the Storage Platform Console which customers enabled archiving?
Yes, the Console can be used to see which collections and groups are enabled for archiving.
The archiving fields are only populated once the first archive has been performed, so it is only possible to see which customers are using archiving after that point.
If I archive a file, will there be a reduction in the volume of total data selected for backup, and does this free up space for additional files to be backed up?
Your total data selected for backup will remain the same, regardless of how much of it is archived.
For example, if you have a 3.5 GB file, your data selected will be 3.5 GB. If you archive that file so that it only takes up 10 KB on disk, the data protected is still 3.5 GB because that is what is stored on the Storage Platform and associated with the client account.
We have added extra columns in the console so you can see how much has been archived.
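The arithmetic in the example above can be worked through in a few lines. This is only an illustration: the `after_archiving` helper and its keys are hypothetical, and binary units (1 KB = 1024 bytes) are assumed.

```python
KB = 1024        # assumption: binary units
GB = 1024 ** 3


def after_archiving(file_size: int, stub_size: int) -> dict:
    """Compare platform-side and local figures for one archived file."""
    return {
        "data_selected": file_size,      # unchanged by archiving
        "data_protected": file_size,     # still stored on the Storage Platform
        "local_disk_usage": stub_size,   # only the stub remains on disk
        "local_disk_freed": file_size - stub_size,
    }


example = after_archiving(int(3.5 * GB), 10 * KB)
```

Only the local figures change; the amount associated with the client account on the platform stays at 3.5 GB.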
How will archiving affect my storage usage per account on the platform? For example, is data deduplicated between my backups and the archives?
Your data selection remains exactly the same, and the data protected stays unchanged too. You are basically specifying that a portion of your backed-up data is also archived, but you are still storing the same amount of data on the platform which is associated with the client account.
Do you have a data insight tool that can work out how much will be archived and how much space will be saved?
There are two ways to do this:
1. In the Pro software there is a button that allows you to calculate your savings.
2. We have made it possible to scan your entire system so you can see how much you can save. A slider is used to set the archiving threshold to however many days you want. This adjusts the graphs accordingly revealing what effect archiving will have on a particular server.
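The effect of moving the threshold slider can be approximated with a simple calculation over file sizes and ages. The sketch below is not the product's Calculate Savings logic: `estimate_savings` is a hypothetical helper, the 4 KB stub size and 1 KB minimum are taken from the figures quoted elsewhere in this article, and backup status and exclusions are ignored.

```python
def estimate_savings(files, threshold_days, stub_size=4096, min_size=1024):
    """Estimate how many files would be stubbed and how many bytes freed.

    files: iterable of (size_bytes, days_since_last_use) tuples.
    Returns (stubbed_file_count, bytes_saved).
    """
    count = 0
    saved = 0
    for size, age_days in files:
        if age_days >= threshold_days and size >= min_size:
            count += 1
            # A stub still occupies some space, so savings are never negative.
            saved += max(size - stub_size, 0)
    return count, saved
```

For example, lowering `threshold_days` from 90 to 5 on the same file list can only increase (or leave unchanged) both numbers, which is what the graphs reflect as the slider moves.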
I’ve tried archiving, but I don’t like it and want to remove it. How do I do this?
If you want to keep all your data, the simplest way is to rehydrate all your data and disable the feature.
- Click the “Archive” button to open the Start Archiving menu, then click on “Show Advanced Options”.
- Click on “Rehydrate all stubs” and finally click on “Start Rehydration”.
- When the process has completed, run the rehydration again (see below for note about archive stats).
- Finally, in the Archiving options, untick the checkbox marked “Stub backed up files not accessed in the last” to disable the feature.
What happens if my backup client is uninstalled?
Any further stubbing will stop. All existing stubs will remain stubbed until they are accessed. Files will only get rehydrated when they are completely read from disk. Even when being written to disk, they will remain partially rehydrated unless fully overwritten.
Do you get a warning if you uninstall the Redstor software before rehydrating?
You can uninstall Redstor Pro, but it will leave the filter driver and rehydration service on the server. There is no uninstall option for these other than via the command line. At present, it requires a manual step where you must rehydrate all files first.
Can you reverse the process?
Yes, install the Redstor Pro client again and reconnect to the account. The rehydration service will be set up again and archiving can resume.
What happens if the account has been deleted?
It will no longer be possible to rehydrate stubs or recover files when an account that has been using archiving has been deleted.
To guard against this, it is impossible to use the Console to delete an account if it has any stub files still archived.
However, there are other ways to delete an account.
There will be times when an account needs to be automatically deleted from an evaluation group after a certain time. For that reason you cannot enable archiving on an evaluation account. But it is still possible to delete an account that has been archiving. This can be done by downgrading the account to evaluation in a group which has auto-deletion settings enabled. Please note that all stubs must be rehydrated before an account can be deleted.
What if I just disable the account instead?
Rehydration is not possible if an account or group is disabled. If this happens, users will not be able to access their archived data. Be mindful when archiving.