Friday, January 31, 2014

Objectives for 2014

I believe I missed the cutoff for the obligatory 2013 retrospective post, and I'm sure I will revisit last year as necessary, but I am really excited about 2014. I've worked for VMware as a network and security specialist, a BCDR specialist, and a storage specialist, and before that as a PSO solutions architect. I enjoy getting my hands dirty and staying close to customers, and this year offers plenty of opportunities to do both.


Keeping with my storage roots, I am working a lot with Virtual SAN these days. There's no shortage of "How to set up Virtual SAN" posts, but why another scale-out storage platform? Isilon is scale-out NAS that pretty much looks like racked servers, not to mention Gluster and Lustre. Then there are ScaleIO, Nutanix, SimpliVity... the list could go on, and it could change on a regular basis, since the storage market has been lucrative for as long as I have been paying attention (since 2000). What sets Virtual SAN apart is its focus on proactively aligning storage capabilities with a VM's storage policy.

Traditional storage is built on basic capabilities or service levels that provide availability, performance, and capacity. Availability could be local or remote (RAID and block replication, for example). Performance could be tied to the spindle count in a RAID set, the amount of read or write cache, or auto-tiering with flash. Capacity can be thick- or thin-provisioned. Getting the right SLA out of those capabilities was (and still is) a perpetual balancing act for enterprise storage admins. Should this database go on RAID5 or RAID10? What happens when the tablespaces outgrow their LUN(s)? And so on...

Auto-tiering for performance gets closer to the problem, but the policies that govern the migration of blocks between tiers are still reactive. You can place a workload on a datastore and hope that those policies react fast enough to satisfy the performance requirements while also being efficient enough to make the most of a scarce flash resource. Given the rush of All-Flash Arrays (AFAs), there is obviously a market for not doing auto-tiering at all. Why bother with policies to manage tiers if you only have the highest tier to work with? An AFA is much simpler, and even though I believe that not all data is flash-worthy, simplicity as a design principle should never be discounted.

So to circle back, Virtual SAN doesn't ask for RAID configuration or LUN carving of a storage pool. You create storage policies and assign them to VMs, and those VMs inherit the benefits of the storage pool accordingly. Availability, performance, and capacity can be set proactively per policy, and the platform still reacts dynamically to the workload without the traditional management headaches. I plan on going into detail on the use cases, such as scale-out applications, Big Data (perhaps redundant, as you could just as easily describe it as "scale-out data"), test/dev, virtual storage appliance replacement, and use as a DR target.
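To make the per-VM policy idea concrete, here is a minimal sketch in Python. It is not the vSphere or Virtual SAN API: the `StoragePolicy` class, its field names, and the "gold"/"bronze" policies are all hypothetical, and the arithmetic assumes plain mirroring, where tolerating n failures takes n + 1 full copies of the data and at least 2n + 1 hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical model of a per-VM storage policy (not a VMware API)."""
    name: str
    failures_to_tolerate: int = 1  # host or disk failures the data should survive
    stripe_width: int = 1          # disks each replica is striped across

    def replicas(self) -> int:
        # Mirroring assumption: surviving n failures needs n + 1 full copies.
        return self.failures_to_tolerate + 1

    def min_hosts(self) -> int:
        # n + 1 replicas plus n witnesses to keep a majority: 2n + 1 hosts.
        return 2 * self.failures_to_tolerate + 1

    def data_components(self) -> int:
        # Each replica is split into stripe_width pieces.
        return self.replicas() * self.stripe_width

    def raw_capacity_gb(self, vmdk_gb: float) -> float:
        # Raw pool capacity consumed by one fully written virtual disk.
        return vmdk_gb * self.replicas()


# Illustrative policies; names and values are made up for the example.
gold = StoragePolicy("gold", failures_to_tolerate=2, stripe_width=2)
bronze = StoragePolicy("bronze", failures_to_tolerate=0)

if __name__ == "__main__":
    for policy in (gold, bronze):
        print(policy.name, policy.replicas(), policy.min_hosts(),
              policy.raw_capacity_gb(100))
```

The point of the sketch is the direction of the decision: the admin states what a VM needs (failures to tolerate, stripe width), and the cost in copies, hosts, and raw capacity follows from the policy, rather than being fixed up front by a RAID set or LUN layout.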

vHPC and vHadoop

Last year I presented at VMworld US and Europe with UCSF on virtualizing high performance computing. This year, with the latest improvements in vSphere 5.5 and vCAC 6.0, I have my work cut out for me proving the flexibility and performance of virtualization for HPC-type applications. Early in my career I worked at Argonne National Laboratory outside of Chicago, and since college I have had a passion for distributed systems in computer science. These days I typically approach them from a systems engineering perspective instead of as a programmer. But everything I'm working on lately has programming attached, or to be more accurate, event-driven scripting and slogging through APIs.

In addition, I've spent the past two years working on virtualizing Hadoop and helping customers who have been early adopters. Automation and virtualization arguably go together very well, and scaling with flexibility is key. Even "Hadoop", to me, is a bit of a misnomer these days, as I think in terms of MapReduce and HDFS. And then Hive, or Drill, or Giraph, and layers upon layers. Sidenote: when I think about layers, I think about Shrek, and then that makes me think of "I Am Legend". There is so much potential for optimizing the compute and data layers, independently or tightly coupled. I'm looking forward to working on several of these use cases this year as well.


I started at VMware focused on business continuity and disaster recovery. If you looked through my virtual customer whiteboards, you would see thousands of DR plans and Site Recovery Manager drawings. I haven't given up on SRM, and I am pleasantly surprised by the number of customers building out stretched clusters, running active/active applications across regions with tools like GemFire, and pushing the boundaries of what site services they can abstract with PaaS, and how.

Again, really looking forward to 2014 and working on lots of opportunities.
