Easy & Secure Cloud Storage: Parker’s Cloud History

Parker College of Chiropractic has been using Amazon “Simple Storage Service” (S3) for incidental storage of audio files and documents since May 2009. Interest in the arena was due primarily to the “cool” factor, and secondarily to its value as a way to cheaply archive Apreso-generated course lecture MP3 files from the previous academic term at a time when available SAN disk space was at a premium. As the dean had declared that Information Services would not make the files available to students of subsequent terms, the archive had no practical reason for being retained beyond a surfeit of caution. The practice of retaining the previous term’s lecture recordings has continued. Students have consistently requested access to previous-term lectures and been denied in accordance with policy; it wasn’t until the snow days of winter 2010/11 that faculty first requested student access to specific lectures from the previous term.

Specific goals during this initial implementation were:

  • Attach the Amazon S3 drive space to a network server in the user profile to view and monitor files manually
  • Attach the Amazon S3 drive space to a network server accessible to the Apache Server of Apreso as the “Archive” drive, for the storage of MP3 files, etc.
  • Files were to be moved, not simply backed up, to the Amazon S3 space.
  • The drive was to function for all practical purposes as a LAN network drive with the understanding that it would be slower; cache and associated functionality should be transparent to the system in general.
  • File data encryption on the S3 drives was undesirable due to the “new” nature of cloud storage, and the apparently transient nature of the applications; it was important to successfully access the files from any of several available applications in the event the one in use was faulty or for other reasons abandoned.
  • Incremental availability and the opportunity to pay only for the disk space actually used make cloud space cheaper in the short and mid term than a new SAN.
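
The “moved, not simply backed up” goal can be sketched as copy, verify, then delete. This is only an illustration, not what WebDrive did internally: it assumes the S3 space is already mounted as an ordinary drive, and the names `sha256_of` and `move_to_archive` are hypothetical. Note that with a caching drive, reading the copy back may verify the local cache rather than the object actually stored in the cloud.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_to_archive(source: Path, archive_dir: Path) -> Path:
    """Copy source into archive_dir, verify the copy, then delete the original.

    archive_dir stands in for the mounted cloud drive (an assumption).
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        target.unlink()  # discard the bad copy and keep the original
        raise IOError(f"verification failed for {source}")
    source.unlink()  # only now has the file been moved rather than backed up
    return target
```

The original is deleted only after the checksums match, so a failed or partial upload leaves the campus copy in place.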

The initial implementation of Amazon S3 cloud space was problematic. WebDrive v9.00 was chosen because it was one of the few applications meeting our requirements, and it was relatively inexpensive. Files uploaded through WebDrive were also readable using S3Fox, a Mozilla Firefox add-on, and several other similar applications. WebDrive added a network drive, attached to Parker’s Amazon S3 space, to the user profile. It was possible to configure the application to make the network drive available to services such as IIS for the storage and access of archived Apreso MP3 files.

Several problems ensued that reduced the implementation to a marginal success:

  • The application was not command-line configurable, i.e., not dynamic from a programming perspective, which made repeated installation and use on multiple servers time-consuming and cumbersome.
  • When the drive was configured for network services access, it was not visible in Windows Explorer to the logged-on user.
  • When the drive was configured for network services access and simultaneously mapped for the user, WebDrive presented a duplicated folder structure, which could be confusing but was otherwise without noticeable impact.
  • The manner in which WebDrive handled its cache occasionally created problems:
    • first from the cache filling up the drive,
    • then, once the cache was configured to a minimal size, from some files being lost or corrupted.
    • For our purposes, the cache was of no practical value except purely as a data buffer for files in transit.
  • By the time Apreso was updated to Echo360, all of this archived data was deemed sufficiently old or of such compromised quality that it was deleted.
  • The file name symbol translation used by WebDrive was faulty:
    • some files were uploaded that WebDrive was unable to delete,
    • some files, once uploaded, could not be retrieved back to the Parker campus file system.
    • It was also observed that S3Fox and other such applications had similar problems, and some files/folders were not finally deleted until late 2010.
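
The file name problem above is the kind a reversible encoding avoids. How WebDrive or S3Fox actually translated symbols is not known to me; the following is only a sketch of one round-trip-safe scheme, percent-encoding via Python’s standard `urllib.parse`, with hypothetical helper names. Every encoded name decodes back to exactly the original, so a file that can be uploaded can also be found again for retrieval or deletion.

```python
from urllib.parse import quote, unquote


def to_object_key(file_name: str) -> str:
    """Percent-encode everything except letters, digits, and "-", ".", "_", "~"."""
    return quote(file_name, safe="")


def from_object_key(key: str) -> str:
    """Reverse to_object_key, recovering the original file name."""
    return unquote(key)


def round_trips(file_name: str) -> bool:
    """Check that a name survives encoding and decoding unchanged."""
    return from_object_key(to_object_key(file_name)) == file_name


if __name__ == "__main__":
    name = "Lecture #3 (50%).mp3"
    print(to_object_key(name))  # Lecture%20%233%20%2850%25%29.mp3
    print(round_trips(name))    # True
```

Because the “%” character is itself encoded, a name that already contains something like “%20” still comes back unchanged, which is where ad hoc translation schemes tend to break.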

Cloud storage of MP3 files ended with the Echo360 implementation in the Parker College of Chiropractic AT&T IDC environment, where one terabyte of disk space was made available for online storage and a second terabyte for archive storage. Due to its incompatibility with Windows 7, WebDrive was replaced in September 2010 with Gladinet Desktop Professional Edition V2.3.432.9299, which I use both professionally and personally on my laptop and desktop computers, at work and at home.

Some interesting developments in cloud storage have arisen, and Chris (my CIO) expressed renewed interest, stating that if we can enumerate the advantages we may incorporate this technology into the coming year’s budget process. Summary developments in cloud data storage and some desirable features:

  • Virtual devices in the VMware ESX environment make it possible to treat Amazon S3 space as Network Attached Storage with $0 hardware investment.
  • The option to transition to a hardware device from a virtual device would be ideal.
  • De-duplication has been applied by some applications, reducing the size of data on cloud disks and further reducing storage cost.
  • Data encryption that is readable between applications may now be in use.
  • Synchronous snapshots that periodically capture the entire file system, and de-duplicated snapshots stored in the cloud, function as data backup with little additional expense.
  • Block and file-level access may both be available
  • Data recovery is possible from prior snapshots, with the data streamed on premises from the cloud.
  • Sophisticated per-volume caching may be available to tier data access time based on object access frequency, emulating an enterprise archive of sophisticated pedigree.
  • Cloud storage is highly protected against not only a disk drive failure but also the failure of an entire array or even an entire site; information survives those kinds of events and remains accessible on demand.
  • Directory integration of the cloud file system security.
  • The ability to replicate a “system” to the cloud and make it active would be totally awesome; to have the replica available only if needed would be even nicer.
  • Data in the cloud should be conditionally accessible to other applications or for direct web access: e.g., Echo360 media files are accessed directly and efficiently from the web server and Wowza server; those files should be able to be stored on a NAS drive, transition to the cloud, and remain directly accessible.
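
To illustrate why de-duplication reduces storage cost, here is a minimal sketch of a content-addressed store that keeps each distinct chunk only once. It is not any vendor’s implementation; the `DedupStore` class, its fixed-size chunking, and the in-memory dictionaries standing in for cloud objects are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy de-duplicating store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 digest -> chunk bytes (stands in for cloud objects)
        self.files = {}   # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into fixed-size chunks and store only the unseen ones."""
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Reassemble a file from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes actually held, which is what a pay-per-use bill would reflect."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Storing the same lecture recording under two names occupies the space of one copy, since the second `put` finds every chunk already present.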

There are numerous vendors offering “gateway” products that may be of interest to Parker in the coming years; some of these are:

Over the next few weeks I’ll be reviewing these and other similar products for Parker’s potential use. I’ll document my thoughts here, and I’ll be presenting my findings to the rest of my department on May 3, 2011.

About Erik

A Computer Guy