<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Storage Architect &#187; archiving</title>
	<atom:link href="http://thestoragearchitect.com/tag/archiving/feed/" rel="self" type="application/rss+xml" />
	<link>http://thestoragearchitect.com</link>
	<description>Storage, Virtualisation &#38; Cloud</description>
	<lastBuildDate>Tue, 07 Feb 2012 10:08:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Cloud Computing: Trawling the Toxic Wastedump</title>
		<link>http://thestoragearchitect.com/2009/03/20/cloud-computing-trawling-the-toxic-wastedump/</link>
		<comments>http://thestoragearchitect.com/2009/03/20/cloud-computing-trawling-the-toxic-wastedump/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 13:35:00 +0000</pubDate>
		<dc:creator>Chris M Evans</dc:creator>
				<category><![CDATA[Cloud]]></category>
		<category><![CDATA[archiving]]></category>
		<category><![CDATA[backup]]></category>
		<category><![CDATA[Cloud storage]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[filers]]></category>
		<category><![CDATA[files]]></category>
		<category><![CDATA[toxic files]]></category>
		<category><![CDATA[toxic waste]]></category>

		<guid isPermaLink="false">http://thestoragearchitect.com/?p=433</guid>
		<description><![CDATA[<p><a href="http://thestoragearchitect.com/2009/03/20/cloud-computing-trawling-the-toxic-wastedump/recycle/" rel="attachment wp-att-434" ></a>A quick Tweet with Chris Mellor just reminded me of something I touched on a <a href="http://thestoragearchitect.com/2008/12/15/redundant-array-of-inexpensive-clouds-pt-i/" >few months ago</a> but always meant to write about in more detail.  It&#8217;s a semi-serious analogy (it is Friday after all) but there&#8217;s a hint of the possible about it, so here goes.</p> <p>We&#8217;re [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://thestoragearchitect.com/2009/03/20/cloud-computing-trawling-the-toxic-wastedump/recycle/" rel="attachment wp-att-434" ><img class="alignright size-full wp-image-434" title="recycle" src="http://thestoragearchitect.files.wordpress.com/2009/03/recycle.jpg" alt="recycle" width="182" height="182" /></a>A quick Tweet with Chris Mellor just reminded me of something I touched on a <a href="http://thestoragearchitect.com/2008/12/15/redundant-array-of-inexpensive-clouds-pt-i/" >few months ago</a> but always meant to write about in more detail.  It&#8217;s a semi-serious analogy (it is Friday after all) but there&#8217;s a hint of the possible about it, so here goes.</p>
<p>We&#8217;re all creating too much content.  Whether that&#8217;s at home or at work, there&#8217;s just too much stuff about.  It&#8217;s duplicated files, downloads of software, documents we created and never looked at, pictures (some useful, some we&#8217;ll never print but just retain regardless).  What we need is a waste dump for all of this infrequently referenced content.</p>
<p>So, the cloud storage offerings become an extension of your desktop.  A background task trawls through your files, looking for stuff you&#8217;ve not referenced for a while, pushing it off to a cloud storage vendor and leaving a stub or link to the file in case you need it.  The vendor analyses each file, determines whether the content is useful and pushes it back to you if it is, or deletes it if not (alternatively, you get a report indicating whether your data is any use, and you choose whether to delete it or retain it at low cost in the cloud).  All of the functionality is policy-based, with the policies set by the customer.</p>
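<p>The desktop side of that idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's actual agent: the 180-day threshold, the <code>.stub</code> naming and the function names are all hypothetical, and a local archive directory stands in for the cloud storage vendor. It also relies on file access times, which many systems record only loosely (for example when mounted with noatime or relatime), so treat the selection as approximate.</p>

```python
import shutil
import time
from pathlib import Path

STALE_AFTER_DAYS = 180  # hypothetical policy value, set by the customer


def find_stale_files(root, stale_after_days, now=None):
    """Return files under root whose last access time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    return [
        path
        for path in sorted(Path(root).rglob("*"))
        if path.is_file()
        and path.suffix != ".stub"  # never re-archive our own stubs
        and cutoff > path.stat().st_atime
    ]


def archive_with_stub(path, root, archive_dir):
    """Move one file into archive_dir and leave a small text stub in its place.

    archive_dir is a stand-in for the cloud vendor and is assumed to live
    outside root.
    """
    path, root, archive_dir = Path(path), Path(root), Path(archive_dir)
    target = archive_dir / path.relative_to(root)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(target))
    stub = path.with_name(path.name + ".stub")
    stub.write_text(f"archived-to: {target}\n")
    return stub
```

<p>A scheduled job would call <code>find_stale_files</code> on a folder and pass each result to <code>archive_with_stub</code>; the analysis and delete-or-retain decision described above would then happen on the vendor side, which this sketch does not attempt.</p>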
<p>What&#8217;s in it for you?  As a home user, you get your pictures, Word documents, video and MP3s all nicely indexed and managed.  You don&#8217;t have to think about the process of keeping stuff organised; just throw it at your virtual recycling centre.  If you&#8217;re a business, your files get scanned for illegal or copyrighted content and checked against your policies on business-related content &#8211; they could even be mined for their content, depending on what the files are.  As a business, you keep your onsite storage costs down, you get offsite backup &amp; location independence, and you gain the ability to certify to the authorities that you proactively seek out and remove content which may result in litigation from copyright owners.</p>
<p>What does the vendor get?  Well, customers may choose to back up and/or archive their files &#8211; for a fee.  We&#8217;re all inherently lazy, and if someone else confirms that you&#8217;re only backing up what&#8217;s necessary then you&#8217;ll be attracted to the service.  Businesses can be charged for processing content, and the cost can be made more attractive than simply keeping more data on bigger and bigger filers.  Bear in mind that cloud vendors are going to have to differentiate themselves somehow.</p>
]]></content:encoded>
			<wfw:commentRss>http://thestoragearchitect.com/2009/03/20/cloud-computing-trawling-the-toxic-wastedump/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Taking out the trash</title>
		<link>http://thestoragearchitect.com/2007/12/17/taking-out-the-trash/</link>
		<comments>http://thestoragearchitect.com/2007/12/17/taking-out-the-trash/#comments</comments>
		<pubDate>Mon, 17 Dec 2007 18:25:00 +0000</pubDate>
		<dc:creator>Chris M Evans</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[archiving]]></category>
		<category><![CDATA[data growth]]></category>
		<category><![CDATA[de-duplication]]></category>
		<category><![CDATA[file systems]]></category>
		<category><![CDATA[tiering]]></category>
		<category><![CDATA[unstructured data]]></category>

		<guid isPermaLink="false">http://thestoragearchitect.wordpress.com/2007/12/17/taking-out-the-trash/</guid>
		<description><![CDATA[<p>In a <a href="http://feeds.hds.com/~r/hds/hu-yoshida/~3/197673863/the_changing_enterprise_data_profile-_idc.html" >recent post</a>, Hu Yoshida references an IDC presentation discussing the rate of growth of structured versus unstructured data. It seems that we can expect unstructured data to grow at a rate of some 63.7% annually. I wonder what actual percentage of this data represents useful information?</p> <p>Personally I know I&#8217;m guilty [...]]]></description>
			<content:encoded><![CDATA[<p>In a <a href="http://feeds.hds.com/~r/hds/hu-yoshida/~3/197673863/the_changing_enterprise_data_profile-_idc.html" >recent post</a>, Hu Yoshida references an IDC presentation discussing the rate of growth of structured versus unstructured data. It seems that we can expect unstructured data to grow at a rate of some 63.7% annually. I wonder what actual percentage of this data represents useful information?</p>
<p>Personally, I know I&#8217;m guilty of data untidiness. I have a business file server onto which I regularly heap more data. Some of it is easy to structure; Excel and Word documents usually get named with something meaningful. Other stuff is less tangible. I download and evaluate a lot of software and end up with dozens (if not hundreds) of executables, MSI and ZIP files, most of which are cryptically named by their providers.</p>
<p>Now the (personal) answer is to be more organised. Every time I download something, I could store it in a new structured folder. However, life isn&#8217;t that simple. I&#8217;m on the move a lot and may download something at an Internet cafe or elsewhere where I&#8217;m offline from my main server. Whilst I use offline folders and synch a lot of data, I don&#8217;t want to synch my entire server filesystem. The alternative is to create a local image of my server folders and copy data over on a regular basis. Trouble is, that&#8217;s just too tedious, and when I have oodles of storage space, why should I bother wasting my time? There will of course come a time when I have to act. I will need to upgrade to bigger or more drives and I will have (more) issues with backup.</p>
<p>How much of the unstructured data growth out there occurs for the same reasons? I think most of it. I can&#8217;t believe we are really creating genuinely useful content at a rate of 63.7% per year. I think we&#8217;re creating a lot of garbage that people are too scared to delete and can&#8217;t filter adequately using existing tools.</p>
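<p>To put that rate in perspective, here is a rough compound-growth calculation in Python. It assumes the 63.7% figure holds steady year after year, which is a simplification, and the 10 TB starting size is a made-up example rather than a number from the IDC presentation.</p>

```python
import math

ANNUAL_GROWTH = 0.637  # the 63.7% annual growth figure quoted above


def projected_size(start_tb, years, annual_growth=ANNUAL_GROWTH):
    """Capacity after compounding the growth rate for a number of years."""
    return start_tb * (1 + annual_growth) ** years


def doubling_time_years(annual_growth=ANNUAL_GROWTH):
    """Years needed for the data volume to double at a steady growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # Hypothetical 10 TB file server, projected five years out.
    print(round(projected_size(10, 5), 1), "TB after five years")
    print(round(doubling_time_years(), 2), "years to double")
```

<p>On those assumptions the data doubles roughly every 1.4 years, and a 10 TB estate grows nearly twelvefold in five years.</p>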
<p>OK, there are things out there to smooth over the cracks and partially address the issues. We &#8220;archive&#8221;, &#8220;dedupe&#8221; and &#8220;tier&#8221;, but essentially we don&#8217;t *delete*. I think if many more organisations operated a strict Delete Policy on certain types of data after a fixed non-access time, then we would all go a long way to cutting the 63.7% down to a more manageable figure.</p>
<p><em>Note to self: spend 1 hour a week tidying up my file systems and taking out the trash&#8230;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://thestoragearchitect.com/2007/12/17/taking-out-the-trash/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

