The Case for Data Protection – Tuesday’s Ransomware Attack


The second reported attack of NSA-esque ransomware this Tuesday should not surprise any systems administrator or IT staff. These attacks are happening with increasing frequency, and with the release of the “Vault 7” documents serving as a how-to for hackers, they will only increase. Google hacker culture, Vault 7, or script kiddies for the background. Suffice it to say that dangers like these are a growing concern that needs to be addressed in your data protection plan.

Data Protection Plans

Getting back to the basics of data protection, today’s article will discuss how backups, as part of your data protection program, can help with ransomware attacks. Backups may bring up visions of hurricanes or tornadoes, but their value goes well beyond that. Data protection also means, well, protecting your data. From all the threats out there, including accidentally deleted files, not-so-accidentally deleted files, and even ransomed files.

So, you may be asking, how does data protection actually protect me from ransomware? To put it simply, ransomware doesn’t remove your data and your files the way a tornado, hard drive crash, or hurricane does. It removes YOUR ACCESS to that data and those files. Time and mathematics, instead of wind, rain, and lightning, are denying access to your data. The files are still there, but you can’t use them to do what you need to do.


This is the case for backups. We previously discussed several use cases for snapshots in an earlier article, but in this instance any backup will do, as long as it was taken BEFORE the systems became infected with ransomware.

To this point, you should have a backup SCHEDULE. That means you don’t just keep the latest copy of your backup; you keep staggered copies of your backups. One of the most famous backup schemes is the Grandfather-Father-Son rotation, sketched below. While designing a backup schedule is beyond the scope of this article, suffice it to say that you should have at least one month of good backups to restore from. Many of the backup appliances on the market these days take care of this for you. And with compression and deduplication technologies, the amount of data that can be stored on-site or remotely is truly astounding.
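To make the rotation concrete, here is a minimal Python sketch of one common Grandfather-Father-Son convention – first of the month as grandfather, Sundays as fathers, everything else a son. The labels and retention windows are assumptions for illustration; your backup appliance almost certainly implements its own variant.

```python
from datetime import date, timedelta

def gfs_label(day: date) -> str:
    """Classify a daily backup under a Grandfather-Father-Son rotation."""
    if day.day == 1:
        return "grandfather"  # monthly: keep ~a year's worth
    if day.weekday() == 6:    # Sunday
        return "father"       # weekly: keep ~a month's worth
    return "son"              # daily: keep ~a week's worth

def to_prune(backups: list[date], today: date) -> list[date]:
    """Return backups that have aged out of their retention tier."""
    keep_days = {"son": 7, "father": 31, "grandfather": 366}
    return [b for b in backups if (today - b).days > keep_days[gfs_label(b)]]

today = date(2017, 6, 27)
history = [today - timedelta(days=n) for n in range(90)]
print(f"{len(to_prune(history, today))} of {len(history)} backups can be pruned")
```

The point of the stagger: even if last night’s son is encrypted along with everything else, an older father or grandfather predates the infection.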

This solution is not perfect, but it is better than paying someone to release the data that you generated in the first place. Or maybe not – maybe the hackers will sell you an enterprise license? Good data protection policies deal with many ways to keep your data YOUR data, and that includes making sure you can access it.

In the grand scheme of things, this rash of WannaCry-type ransomware attacks will continue. While security companies are working rapidly to shut these attacks down, if your data protection isn’t cutting the mustard, an attack will cripple your ability to support the other departments of your company. It is time to have a discussion with management about your data protection strategy and how these attacks affect it. As the saying goes, “Life is tough, but it’s tougher if you are stupid.”


The SMB IT Guy’s Guide to Ensuring Your Success with Hyperconvergence.


By now, everyone realizes the advantages of server virtualization. Flexibility in the face of rapidly changing technology, reduced administrative load on busy IT staff, and cost savings from fewer physical machines are just the beginning. As you may have heard, hyperconverged infrastructure solutions offer all of these advantages, plus the added benefit of simplicity in your environment.

This article is targeted at small to mid-sized businesses: 50 to 500 employees supported by 1 to 5 or so staffers in the IT department. These IT shops don’t rely on specialists, but on a few really good jacks-of-all-trades. If you are looking for a way to bring this up with the boss, make sure to see the article written for the senior directors or owners in the business here.

So – there are a lot of different hyperconverged vendors out there and a lot of solutions. If you believe the literature and web demos, they will all do everything you need in your environment. How do you know which is the best and what to look out for?

As with everything in life, the answer is: it depends. No one can say which is best for you without intimate knowledge of your environment, which probably only you have. What I can do is provide you with some questions to ask the various solutions providers. These may help you determine which solution works best for your organization – and which one management will sign off on.

Here are 5 questions to help you in your inquiry.

What comes in the box?

Well, not literally, but what does this solution entail? How many servers of MINE will this solution cover, and how much extra capacity will I have? Are there any extras that might later cost me money or maintenance fees? Are installation services needed and possibly included in this solution? Is high availability between hardware units included in this quote? The answers to these questions may not make or break the solution for you, but you should know what you are getting for your money. You need to be able to present this effectively to management so no one gets any unpleasant surprises later. Maybe you only need a barebones system right now. That’s fine, but make sure that you know what is included and what everyone’s expectations are.


Licensing

There are a few main solutions out there, and they all handle licensing differently. Many manufacturers of these solutions OEM hypervisors, so ask how that affects the cost of your unit(s). Is there the possibility of having to purchase additional software licenses in order to expand? Are all of the management consoles and utilities provided under the license of the hyperconverged product? If not, what isn’t included that I may want, and where can I get it? Do I deal with the hyperconverged manufacturer, or do I have to drag another vendor into this? How many vendors are involved in this solution, and who do I call if I need support? Are there different license tiers based on count? What do my maintenance costs look like 3 and 5 years out? If my server count grows by 20% per year, what additional costs will I encounter? Most solutions providers will be more than happy to work these numbers for you (a rough sketch of the arithmetic follows), and your management will love your forward-thinking “strategic planning”.
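If you want to sanity-check a vendor’s numbers yourself, the projection is simple compounding. Everything below is a placeholder assumption – a flat per-server license fee and maintenance at 20% of license cost – so substitute the figures from the actual quote.

```python
def project_costs(servers: int, fee_per_server: float, growth: float,
                  years: int, maint_rate: float = 0.20) -> None:
    """Print a rough license + maintenance projection under annual growth."""
    for year in range(1, years + 1):
        servers = round(servers * (1 + growth))
        licenses = servers * fee_per_server
        print(f"Year {year}: {servers} servers, "
              f"licenses ${licenses:,.0f}, "
              f"maintenance ${licenses * maint_rate:,.0f}/yr")

# 20 servers today, a hypothetical $1,500/server fee, growing 20% per year
project_costs(servers=20, fee_per_server=1500.0, growth=0.20, years=5)
```

If the vendor’s five-year number is wildly different from yours, ask why – tiers, bundles, or true-up fees are usually the answer.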

Simplicity and Ease of Use

Hyperconverged infrastructure solutions are all about making things simple, right? Find out. Get to know how this particular solution works. You don’t need to see the actual code, but it is nice to know conceptually how everything fits together. Does this solution come with any training? Is training required? Is training an extra cost? Are basic functions like setting up virtual machines, virtual disks, and virtual NICs intuitive? What about more advanced tasks? That pesky application of ours that demands VLAN tagging – how does this solution support that? Can I do every task I need to do from the management interface? How easy is this product to use for someone who isn’t a pre-sales engineer working for the manufacturer?

Backup, Recovery, and Failover

OK – we are looking at this solution because recovery and business continuity are supposedly made much easier with this. Can I stop dropping by the office after hours and on weekends to do silly little server tasks, like rebooting crashed boxes… for payroll… at the end of the month? How does this solution help me with recovery tasks? How does it handle a crashed server? How does the solution handle network failures, disk failures, or whole server failures? Can I SEE it demonstrated live? How will this solution affect my existing backup strategy? Will my current backup solution work, or does this solution include something that replaces it? Does it do native snapshots? How many? Will it replicate those snapshots somewhere automagically? How can my existing DR plan be improved with this solution?


Growth and Expandability

Everyone has a constantly changing environment. How does this solution handle growth and changing needs? What does it take to add 20% capacity to this solution? How much does it cost, and how easy is it to do? Will I have to stop production or do it at 3am? Do I need additional chassis to do this, or can I upgrade the units internally? Will this require downtime? What if I want to start moving things to the edge of my infrastructure? How flexible is this product? Do I have the ability to add more memory, CPU, or plain disk to this solution independent of purchasing the next model? What is the roadmap for this product line – flash storage, software, and NIC speeds?

Hyperconverged infrastructure promises to be an amazing step in the IT virtualization lifecycle. There are different capabilities and features in all of the various solutions. You just need to ask a few questions to figure out which one is right for you. Not just right for you right now, but right for you in 3 to 5 years. Only after you can answer the questions above will you be able to enjoy the REAL benefits of simplicity that hyperconvergence provides.


Snapshots – Everyday Uses and Hacks


Snapshots are an amazing storage technology. The ability to take an instant “picture” of a data volume is a tool that can be used in a variety of ways. It makes your job easier and more manageable, and it can help secure your environment.

Different vendors implement snapshots in various ways, but the general theory remains the same: an almost instantaneous copy of data that may be moved and manipulated by a system administrator. The theory is nice, but how can we USE this functionality? Can it make your job easier and protect your systems from the everyday issues you see “in the wild”?

Among the organizations I work with, we see many innovative uses of snapshotting technology. There are amazing examples of real-world IT organizations making their jobs faster, easier, and much less stressful. In other words, they use “business hacks” to make their snapshots work for them. We will discuss five real-world ways to use snapshots that are relevant and applicable to your everyday workload.

Snapshots in your DR strategy

The first thing that pops into most people’s minds is backups and disaster recovery. Snapshots produce an exact copy of virtual machines or data volumes that is stored within the storage appliance. Most vendors allow these snapshots to be replicated or moved to another storage appliance. This allows you to use an appliance in another location as a disaster recovery site. Or it is possible to mount these snapshots as volumes and allow your backup server to incorporate these exact replicas of data into your existing backup or disaster recovery plan.

There are several advantages to this approach. The data in a snapshot is an exact replica at a point in time, so it is easy to manage recovery point and recovery time objectives (RPO and RTO). Also, this approach takes the data backup “offline” from your production servers. Sure, the network and storage are still involved in transferring this data, but the transfers happen out-of-band, which reduces system slowdowns and lag. Many vendors now include APIs for cloud storage in their software and storage appliances, so you may back up your snapshots directly to cloud storage.

Update “insurance” snapshots

We’ve all done it. Installed that patch from our system or software vendor, and it breaks the box. Perhaps “breaks” is a strong word. It temporarily overwhelms our system with new features and benefits. While snapshots can’t make the process of ironing out an ornery system update any easier, they can provide you with insurance.

By taking a snapshot before you update a system, you have an exact copy that you know works. Suppose you cannot straighten out all the goodness that was Big-Name-Accounting-Package 5.0 before Monday 8am rolls around. Now you have the ability to fail back to your old system while you continue to straighten out the misbehaving upgrade – almost a form of version control, for those of you familiar with the software development world (a sketch of the pattern follows). This nifty trick also works on desktops. If you are using VDI, make copies of your desktop images and use the same concept. It may not save you time getting to the next version, but it will certainly save your bacon as far as system uptime and help-desk calls are concerned.
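Here is the snapshot-then-update pattern as a minimal Python sketch. The appliance-cli commands are placeholders (echoed here so the sketch runs as-is); substitute your vendor’s actual snapshot tooling.

```python
import subprocess

def snapshot(volume: str) -> str:
    """Take a pre-update 'insurance' snapshot (placeholder command)."""
    name = f"{volume}-pre-update"
    subprocess.run(["echo", "appliance-cli", "snapshot", "create", volume, name],
                   check=True)
    return name

def rollback(volume: str, snap: str) -> None:
    """Revert the volume to the insurance snapshot (placeholder command)."""
    subprocess.run(["echo", "appliance-cli", "snapshot", "revert", volume, snap],
                   check=True)

def apply_update(volume: str, update_cmd: list[str]) -> None:
    snap = snapshot(volume)
    try:
        subprocess.run(update_cmd, check=True)
    except subprocess.CalledProcessError:
        rollback(volume, snap)  # Monday 8am arrives: fail back and regroup
        raise

apply_update("accounting-vol", ["echo", "install", "Big-Name-Accounting-Package-5.0"])
```

The snapshot costs you almost nothing to take; the rollback is what buys you the weekend.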

Gold copy snapshots

If you are making snapshots of servers before you upgrade, you are probably already doing this, but we will mention it anyway.  Snapshots are amazing tools for creating new servers, virtual machines, or desktops.

Once you have installed an operating system and all the various patches and utilities that you routinely use – take a snapshot. Now this new, untouched, as-pure-as-the-driven-snow system will be the starting point for all new servers or desktops that you implement. This is often referred to as the “gold copy”, a reference to software development, when code is ready to ship out to customers.

This “gold copy” has standard system configurations already in place, various drive mappings, and config files. It is all in there. Sure, you may edit some things like network settings and licensing, but you have a starting place that is pretty solid. In the future, if you need to make changes, just make them and save the result as a new snapshot. This may not seem like much, but anyone who has built a new system from scratch will tell you that it is a genuine lifesaver.

This concept applies to both virtual machines and stand-alone servers or desktops.  Several customers we work with will use an application to “ghost” images from storage appliances to a new non-virtualized server or desktop.  Mount the snapshot you would like to use as your system image, then transfer it over to your new hardware using the disk image utility of your choice.  Of course, this works best in a virtualized environment, but it is also a valuable tool for the not-yet-virtualized.  By the way, why aren’t you virtualized yet?

Instant data set snapshots

We regularly hear from customers asking how to generate test data for new systems testing.  In several cases, systems administration is tasked with creating data sets that the consultants or systems specialists can use to ensure the systems are working as anticipated.

Instead of treating this as a problem, use the best test data there is – an exact copy of your current, live data. There is no need to create new data sets from your existing data. By creating a snapshot of your current databases, you can test systems with what was, moments ago, hot and live data – with no negative impact if that data is corrupted or destroyed. You can even create multiple copies of this data to use across multiple tests.

Getting around malware with snapshots

Today’s data environment can be a pretty scary place. Look no further than the headlines to see stories about malware and ransomware wreaking havoc on organizations. If the recent exploits of the bad guys are any indication, attacks are getting much larger in scope. The WannaCry attack is still fresh in everyone’s minds and is rumored to have affected over 230,000 machines worldwide. It is safe to say that there are external threats to your data that can be remediated with snapshots.

A schedule of snapshots on your storage appliance is the solution. Whether this is part of your disaster recovery planning or not, set up a schedule. The concept is similar to the “patch insurance” we discussed above.

By keeping a number of snapshots over time, we can go back to earlier ones and examine them for malware. Perhaps we can extract data from a snapshot taken before the encryption activated. Of course, some data may still be lost. It is up to management to decide whether to pay faceless hackers for your data or to recover it via backups and snapshots.

Snapshots have been in the storage technology tool bag for a while.  The technology has matured so that most storage array vendors are offering this functionality.  Over years of working with clients, we have discovered many innovative ways that people are using snapshots.  In this article, I have shared what I have seen, but I am interested in what you are doing with your snapshots.  Feel free to share and let everyone know how they can use snapshots within their storage appliance.



The SMB Owner’s Guide to Ensuring Your Success with Hyperconvergence.


Hyperconvergence is the newest IT architecture that is removing both cost and complexity from virtualization infrastructure. This article assumes you are aware of the advantages of hyperconvergence and how it applies to the business end of your small to medium business. What we are going to discuss is how to ensure that you actually get the TRUE advantages of hyperconvergence, not just the ones all those fancy marketing papers promise.

A small to medium business (SMB) doesn’t mean just a tiny kiosk in the mall with a single POS computer. We’re talking about SMB in terms of 50 to 500 employees with an IT staff of up to 5 full- or part-time staffers.

There are a lot of claims out there around hyperconvergence technologies. At the top of the list is reducing costs. It also claims to offer a simpler environment for your IT staff, increasing productivity. As the business owner, what questions do you need to ask to ensure that your hard-earned capital is well spent?

Among all the claims, there are 5 things that you need to look for in a hyperconverged solution to ensure that your solution brings everything to your business that it can.

Vendors in the solution

One of the claims of hyperconvergence is simplification of the solution. This is potentially achieved by eliminating the multitude of vendors that are part of a traditional virtualized solution. This solution involves how many vendors? Where do the individual responsibilities of each vendor start and stop? Will you need multiple support contracts, or is everything covered under one master contract? Is there a central support number to call, or is there the possibility of finger-pointing between various manufacturers? In this vein, is the solution the intellectual property of one company, or are there different licensing agreements in place? How could this affect YOUR investment in the event of a manufacturer bankruptcy?


Licensing

The initial install of the solution is probably correctly sized for your business. What happens if you need to expand that installation? If you need more virtual servers, or to add more users, are there going to be any additional license fees (VMware)? What about yearly maintenance fees – will those grow, too? What if the business expands and I want to add virtual servers at another location? Are my licenses “tiered”, getting more expensive for additional functionality or when I hit a certain license count? These are not necessarily deal-breakers, but forewarned is forearmed. It sure helps to have a reliable idea of licensing costs when budget time rolls around.


Systems Expandability

Hyperconverged solutions come in all shapes and sizes. Different solutions exist for a dozen virtualized servers and for several hundred. Whichever you have is not as important as the answer to the question: is the solution expandable? Can the solution cover your business as it grows without the dreaded “forklift upgrade”, which means downtime for the profit centers of your business? And if upgrades are possible, do they involve downtime? Can your sales department sell while the upgrades occur?


Installation

Sure, everyone will be more than happy to install this beast once you have signed on the dotted line, but just how complex is that installation? Can we operate on the existing systems and minimize downtime while the installation occurs? How complex is the switchover to the new systems (e.g., migrating VMs or data)? Can your IT staff shadow the installation? Is it easy enough that they could do it themselves with just a bit of guidance? Can your staff expand the system, or will you need outside help?

Ease of Use

Now that we have it and everything is running, just how difficult is it to get my IT staff up to speed on the product? Is there additional training that will take my staff off-site in order to learn how to use this product? Once I train my staff, am I in danger of losing them to a competitor willing to pay more for those certifications? When we add additional virtual servers to the environment, will my staff be able to do that? How difficult is it, and how long will it take? Since my staff isn’t as large as some of the big guys’, how difficult is it to cross-train?


Hyperconvergence is an amazing leap forward for IT virtualization. Correctly sized, designed, and implemented, it promises a lot to the small to medium business. But like most things in life, one size doesn’t necessarily fit all. Spending money wisely requires due diligence. Make sure the business squeezes out all of the value that you paid for from this solution. Address the questions around vendors, licensing, systems expandability, installation, and ease of use.

Engage with the manufacturers and ask the solutions providers the next-step questions addressed in this article. This will ensure that you enjoy the advertised advantages while getting the exact solution to benefit your business NOW.


How Do I Connect to My Storage Appliance?


In this article, we are examining the third question asked in our original article, The Beginner’s Guide to What to Know Before You Shop for a Storage Appliance. That question, in a nutshell, is “How do I intend to connect my storage so that all of my applications can get to it?” Answering it calls for a good look at your current environment. Based on what you find, we will determine whether you should connect through your existing network or through dedicated connection technologies. There are also other, less common methods to connect to your storage.

Using Existing network infrastructure

Is your network stable? Every network administrator or sysadmin knows who the problem children are in their network. Do you have any segments or switches in your environment that are congested or causing delays right now? Adding storage traffic will only exacerbate the problem. On the flip side of that coin, a well-running network makes adding storage easy and inexpensive.

In addition, the speed of your existing network will come into play. Depending on your current storage needs, I would recommend against attaching storage at speeds below 1 Gigabit Ethernet. As 10 GigE becomes more affordable and more pervasive in networks, it is never a bad idea to increase bandwidth to your storage. Fortunately, many manufacturers enable upgrading with field-replaceable units. Speak with the vendor about this ability in the units you are investigating.

Most storage appliances will support a variety of connection protocols. For storage area networks (SAN), it is important that iSCSI be supported in the unit; iSCSI will carry most externally mounted volumes, or LUNs (Logical Unit Numbers). For Network Attached Storage (NAS), NFS is a popular way of attaching storage for most virtualization shared storage and *nix computing. A given appliance may support all of these storage protocols or only some of them. SMB/CIFS should be supported for full functionality in a Microsoft network.

Using Dedicated connection technologies

There are situations where the use of the existing network may not be advisable.  If the network is older or slow, putting the data needs of shared storage on the network will just exacerbate an already slow situation.  In this case, there are dedicated connection technologies that may come to the rescue.

Ethernet connectivity is still a very viable alternative, using dedicated switches and VLANs. VLANs are Virtual Local Area Networks that allow for the logical partitioning of ports within a switch to create virtual switches and LANs. This lets you segregate traffic and dedicate resources to the ports carrying your storage traffic.

Fibre Channel (FC) is a mature, well-established connection technology. FC uses glass fibers to connect physical servers to physical storage using light. While this technology is a bit more expensive than traditional Ethernet switching, it does have advantages over Ethernet. There is tremendous support for this protocol in software and hardware because it is a very stable protocol developed specifically for storage. Fibre Channel allows data to be consistently delivered with very low overhead. Fibre Channel switches are available to connect servers to storage in a logical mesh setup, but it is also a regular practice to directly connect servers with FC Host Bus Adapters (HBAs – think of an HBA as the Fibre Channel version of a network card). This cuts out the expense of a Fibre Channel switch for smaller deployments.

Exotic Connection methods

In addition to the well-established protocols of Fibre Channel and iSCSI, there are other ways to connect storage. There are storage appliances out there that will allow connection to servers via specialized technologies like InfiniBand or SAS ports. eSATA is also available. These ways to connect range from the super fast (InfiniBand – and expensive, by the way) to the fairly common and slow. “Exotic” connection technologies serve special cases and are outside the scope of this article. They will limit your field of vendors, but not disqualify you from a storage appliance.

Considerations of Connectivity

In addition to the connection methods discussed above, there are other connectivity possibilities to consider. Bonded connections are one. Bonding makes multiple physical paths (read: cables or ports) appear as one logical path to data. In essence, two 1 Gb Ethernet connections become one logical 2 Gb Ethernet connection. A single path to the storage appliance can be quickly overwhelmed when many servers and users are hitting it at once. Bonding allows several ports to send data simultaneously (the quick arithmetic below shows why it matters). Bonding also helps with failover.
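Some back-of-the-envelope bonding math, with assumed example numbers. Note the caveats: any single flow is still limited to one member link, and losing a link halves capacity rather than cutting off access entirely.

```python
LINK_GBPS = 1.0   # one 1 Gb Ethernet port
LINKS = 2         # ports in the bond
HOSTS = 12        # servers sharing the storage appliance

aggregate = LINK_GBPS * LINKS
print(f"Aggregate bandwidth: {aggregate:.1f} Gb/s (~{aggregate * 125:.0f} MB/s)")
print(f"Per-host share under full load: {aggregate / HOSTS * 1000:.0f} Mb/s")
print(f"Single-flow ceiling: {LINK_GBPS:.1f} Gb/s (one member link)")
print(f"After one link failure: {aggregate - LINK_GBPS:.1f} Gb/s still available")
```

That last line is the failover point made with numbers: a bonded pair degrades; a single cable fails outright.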

Another consideration of connectivity is failover. Although it may not happen often, if a cable, NIC, or port fails on the storage appliance or on the connectivity side, all servers using that storage are suddenly unable to access data. All of your virtual machines may come down at once. You have placed all of your proverbial eggs in that one proverbial basket. Failover mitigates this risk.

This is often accomplished through the use of multiple controllers, or “heads”. Two (or more) controllers allow for multiple disparate paths to the data. One head can crash, and you still have access to your data. One power supply can fail, and you still have access to your data. Manufacturers vary on how they support this functionality, so it is important to research this carefully. Make sure that the storage appliance will run on one power supply. Verify that the controller heads support failover. Implement bonded connections.


In this article, we have discussed the final question raised in our original article about finding the best storage appliance for your environment. We have gone over attaching the shared storage to your existing network, attaching the NAS or SAN via new dedicated connectivity, and attaching via a special, non-standard, or exotic connectivity mode. Many vendors support these differing connectivity methods. Specialized connectivity will limit the number of storage appliances you have to choose from, but most users who need it know so from the start and can plan accordingly.





How Will I Use My Storage Appliance?


We previously discussed doing a storage study for your environment. This article picks up after you’ve done that study and have the numbers to help you determine what you need in a storage appliance. Here we will go into the usage scenario for your environment. In essence, “How will you use this storage appliance?” What applications will be attached, and how many users will be on those applications? How will that affect what I need in a storage appliance? This article is designed as a starting point for the novice user, not the storage expert. It will help you ask the right questions about your environment so that you can find the right answers and get the best solution for your needs.

So, to determine how we will use this appliance, we need to take stock of our environment. Don’t worry, this is not as in-depth as the storage study. As a matter of fact, you probably already know most of this information just from administering your environment. It is just a matter of collecting it all in one place and using it to capture how you are using your environment now and to project how you plan to use a storage appliance in the future.

Numbers of servers, applications, and users.

Probably the single most important consideration of the storage environment is how many: how many servers, applications, and users will be regularly using this storage. Of course, a storage appliance that regularly supplies data to hundreds of users will have different speed and space requirements than an appliance used by only a few. (The Google and Facebook storage environments would amaze you.) So to start, we need a pretty good estimate of how many physical and/or virtual servers may be attaching to this storage and how many users will be accessing data on it.

It stands to reason that a mail server supporting a large company will need more storage resources than one supporting a dozen users. More users mean more space and more speed. This spills over into every aspect of your environment: the larger applications with more users will need more speed and probably more space.

If all of your applications are inward-facing, then your work on this part is almost done. Many companies, however, also host data or applications for outside users as part of their business model. Maybe it is as simple as an ordering system that a few trusted customers are allowed to access, or it might be as involved as your company hosting data as its business model. Either way, it is important to count outside customers in the numbers that you will use to determine storage requirements. Those customers may be the most important of any that you have.

Future Growth

Also important, although it is not our primary concern, is future growth.  This includes anything that will grow the amount of storage, like acquisitions.  The current space and number of users will tell us where our storage appliance needs to be NOW.  Several items in the storage study will show us how large we are growing with current users and applications.  Future growth of employees and business units will give us a look into how much larger we may need to grow outside of our regular growth numbers.  Because almost nothing gets smaller, right?

What applications are you using?

The type of application that you plan to run in conjunction with your storage appliance matters, and there are two primary access profiles. Speed of access is important to applications like databases. Amount of storage is important to applications like file shares and home directories. Please note that these two profiles are NOT mutually exclusive; traditional applications will use a combination of both. A pure inventory database is probably running very lean and wants speed, especially if it is serving records out to multiple sites or users. I have never met a DBA who doesn’t want more speed, and then even more. But that database may reference a document imaging system that contains large files, or it may have BLOBs (binary large objects) inside of it. These things increase the amount of space needed, but also require that objects be accessible in a reasonable amount of time.

Do a site survey of the applications and their types in your environment. Keep in mind that databases are everywhere: in the traditional applications, but also in your mail application, in special applications that may be specific to your business, and CERTAINLY in most business intelligence applications that management may be using.

Is virtualization in this environment, or will it be?

You may be using virtualization in your environment and are looking to add shared storage.  Or you may be looking to virtualize and want to “do it right” by adding a storage appliance right off the bat.  Neither way is wrong and both can apply to this decision.  A well done storage study includes either the servers that are already virtualized or the servers that you will virtualize.

As a small aside, remember that Aristotle said “Nature abhors a vacuum.” Here is how that applies to you: once you see how great virtualization is, only the physically unique servers will remain unvirtualized. I mean servers with physical hardware that cannot be virtualized, like a fax server with special cards, or a huge database server that is clustered for performance or availability.

I mentioned storage space above, and that is an important consideration. Virtualization makes your physical environment much more efficient. With additional storage space, there is always the temptation to build more: more servers, more drives, more home directories with cute downloaded pictures of kittens and recipes. This is not an “if” question; it will happen. Once you are virtualized, every manager’s wish list of applications comes true.


In addition to the virtualization of servers, there is always the virtualization of desktops, laptops, and portals – the end users in your business. VDI is an extension of server virtualization technologies and is making serious inroads into businesses large and small, and the advantages make it easy to see why. While planning actual storage requirements for VDI is outside the scope of this document, it is a consideration. If you are planning to add VDI to your environment, now is the time to start planning. You will need a fair amount of capacity and speed, depending on the number of users you plan to support.

If you are not planning to add it right now, then at least consider the ramifications it could have for your storage environment. New storage appliances are usually a significant purchase, so plan for how you would expand space and speed on the unit you choose.


In a previous article, we discussed things that you should look for before deciding on a storage appliance that is applicable for your environment.  In this article, we went over the second of the information gathering exercises – How you intend to use your appliance.  What your current environment entails as far as users and applications, how those applications access data, and the presence of virtualization or VDI in your environment are all important questions to answer.  In the next article, we will look at how best to connect your storage appliance to your existing network.  Do we use existing infrastructure, or will we be adding the newest and fastest tech out there?  Tune in next week!


The Storage Study – or How do I determine what my environment is using?

Storage Study
© Ultramarine5 | Dreamstime Stock Photos & Stock Free Images

In a previous article, we discussed three important questions to answer about YOUR environment before jumping into a storage appliance. In this article, we will delve deeper into the first question we asked: “How fast and how much storage do you need?” This article is designed for the IT generalist, someone who is looking for some insight on how to do a storage study.

So – how do I tell what I need? The first step is to do a storage study. The storage study is done in your environment over a period of around seven days. Why seven days? Because that will capture an entire work week of your environment. And by work week, I mean those weekends that systems guys work and backups run on as well. Is Saturday a full-backup day? You want to see what the impact is on your systems. Perhaps accounting prepares reports for payroll on Wednesdays. Usually, a seven-day sampling of your storage activity will account for standard practices within your environment without creating massive capture files.

If you would like to capture more than seven days, break it out into multiple seven-day capture files. Perhaps doing multiple sampling weeks during significant system events would reveal more details about your environment. End-of-quarter accounting processing? Start of a new production cycle? You decide.

The storage study should include several important takeaways, collated and also broken out by host or server. These four important metrics are IOP/s, latency, storage footprint, and information on new (or “hot”) data. We will delve a bit deeper into what each of these means below.

Input/output OPerations per Second (IOP/s)

What is an IOP, and what does it mean to my environment? IOP/s, simply put, are a measurement of how many storage operations your host is doing every second. IOP/s can be misleading, though. While a single read operation generally takes 1 IOP, writes to disk can use up to 6 IOPs for the same bit of information (parity-protected RAID levels turn one logical write into several physical reads and writes). Why this happens is a bit technical, so your relevant question should be “How do I account for this?”

In addition to the overall IOP/s number, most studies will include a read vs. write percentage. This is usually written as 65/35: 65% reads across the study and 35% writes. This percentage determines exactly how to account for the IOP/s that were collected. Of primary importance to the IOP/s study, though, is the IOP/s over time. This will help determine when the busy parts of the day are. You should see numbers for the absolute peak (meaning the largest IOP/s event during the entire sampling period), and several percentiles.

The 85th percentile number is what is usually used to determine how to size your system. You can certainly size your system to accommodate your peak IOP/s, but usually that is more appliance than you really need. It follows the same logic as building a house above a 500-year flood plain. Sure, your house won’t get flooded out (statistically speaking) for 500 years, but will the house even be standing by then? A small sizing sketch follows.
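Here is a minimal sketch of that sizing arithmetic: take the 85th percentile of sampled IOP/s, then translate front-end IOP/s into back-end disk IOP/s using the read/write mix and a write penalty. The sample data is synthetic, and the penalty of 6 is typical of RAID 6 – substitute your own study’s numbers and your array’s actual figure.

```python
import random
import statistics

random.seed(42)
# stand-in for 7 days of per-minute IOP/s samples from the study
samples = [max(0.0, random.gauss(2000, 600)) for _ in range(7 * 24 * 60)]

peak = max(samples)
p85 = statistics.quantiles(samples, n=100)[84]  # 85th percentile

READ_PCT = 0.65        # the study's 65/35 read/write mix
WRITE_PENALTY = 6      # physical I/Os per logical write (RAID 6-ish)
backend = p85 * READ_PCT + p85 * (1 - READ_PCT) * WRITE_PENALTY

print(f"Peak: {peak:,.0f} IOP/s   85th percentile: {p85:,.0f} IOP/s")
print(f"Back-end IOP/s to size for: {backend:,.0f}")
```

Sizing to the 85th percentile plus the write penalty usually lands you a comfortable margin without paying for the flood-plain house.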


Latency

OK – we know about how many IOP/s our systems are using over the course of our storage study. Now, how long is it taking those IOP/s to be serviced? In essence, your systems are issuing commands to your storage, but how long does it take your storage to complete each command? Is that number acceptable? Latency is usually measured in milliseconds, and lower is better.

Peak and trending latency are important. If peak latency reaches 100ms, there is cause to investigate further. Most applications are tolerant of brief spikes, but sustained high latency shows up in database record access times, or the spinning wheel/hourglass of uncommitted data. It can be a bit tricky to run down exactly where the slowdown is occurring. Our primary concern with this storage study is confirming that it is NOT happening along the disk I/O path. Common culprits are slower disks, inadequate system RAM, and older CPUs.

If you start to see this number trending up or if you see spikes during the day, this is indicative of concerns in your system. While your disk storage may not be the bottleneck, we would like to be able to disqualify it.  Your planned storage appliance should be sized to accommodate any extra load.

Overall Storage Footprint

Overall footprint is straightforward: how much stuff do you have stored on all of your systems? You will see this represented both per server and for the entire environment you collected. It is often expressed as a total amount of space in the environment – all the space on all the hard drives. The amount of used vs. free space is important. This lets you know how much of all that spinning disk you have actually filled with your data. A small amount of data on fast, expensive disks is not cost-effective.

If you conduct multiple storage studies, compare the amount of used space from one study to the next. This will give you an idea of how quickly your environment is growing. Most of the storage study tools out there will collect information on each disk individually, which allows you to drill down to the application level and find those greedy, disk-hogging applications quickly.

This metric will help you to determine how much overall space you should put into your storage appliance.

“Hot Data”

Hot data is data that is accessed, changed, or newly written by your systems within the storage study collection period.  In essence, this is the data that your applications used during the study.  All other data is not accessed, touched, or read during this time, but may be necessary to keep.  This hot data contains clues into how much your overall data needs may be growing every week.

Hot data also answers the question of how fast your storage needs to be. Writing data puts more of a strain on a system than reading data, so the more writing we do, the faster the system we need. Hot data also gives us a rough estimate of what new data was written on the system. This allows us to extrapolate what your storage needs will look like in a quarter or a year, given your current rate of growth (a quick extrapolation sketch follows).
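The extrapolation itself is simple compounding. All of the inputs below are example numbers; plug in the used capacity and weekly new-data figure from your own study.

```python
used_tb = 12.0            # used capacity today (example)
new_tb_per_week = 0.15    # new data observed during the 7-day study (example)
weekly_growth = new_tb_per_week / used_tb

for label, weeks in (("quarter", 13), ("year", 52)):
    projected = used_tb * (1 + weekly_growth) ** weeks
    print(f"In one {label}: ~{projected:.1f} TB used")
```

With these sample numbers the environment grows from 12 TB to roughly 14 TB in a quarter and 23 TB in a year – exactly the kind of figure you want before deciding how much headroom to buy.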

One important aspect that hot data drives is the speed of the storage appliance. The higher the percentage of hot data, the faster the storage appliance needs to be. Likewise, the larger the overall amount of hot data, the faster the storage appliance needs to be. These are important considerations in correctly sizing storage appliances for both capacity and speed.

Accurate growth rates allow us to properly size the overall storage capacity of a storage appliance. No one wants to buy too little space right off the bat, but it is also pertinent that we not buy too much storage at the onset of the project. Storage prices go down every year as capacities go up, so it is cheaper to buy storage when you need it than to buy it all up front.


A storage study is the first step in determining your needs in a storage appliance.  This report generates many of the metrics that are required to correctly size a storage appliance.  The numbers generated will give us ideas of how much disk space overall we need, and how fast that disk space needs to be.

We have discussed IOP/s, Latency, Storage Footprint, and information on new (or “hot”) data in this article.  Once you have collected these metrics, analyze them.  Next, let’s see how various applications affect our storage needs.


The Beginner’s Guide to What to Know Before You Shop for a Storage Appliance.

Checklist for storage appliance
Answer these questions to determine your best fit storage appliance.

This is a basic guide, meant for the jack-of-all-trades, not the storage professional. There is a dizzying array of storage appliances on the market – appliances with lists of features that would make a Swiss Army knife blush. We could argue the pros and cons of specific features all day, but this article is focused on the IT generalist thinking, “Will this storage appliance help my business?” As such, we will discuss three initial questions to answer before starting the process of determining the best storage appliance for your environment.

These three questions are 1) How fast and how much storage do I need, 2) For what purpose do I need this storage, and 3) What technology will I use to attach this storage to my environment?

How much, how fast?

So – how much storage appliance do you need, and how fast does it need to be? In simple terms, it is time to do your homework. You can collect data on your network using a multitude of tools. The common term for this work is a storage study or survey. In this study, you or an IT engineer will usually activate utilities that are already on your various systems. These utilities collect information about how you use your storage on your various boxes over a period of time (a bare-bones example of the idea follows). As far as the sampling period goes, the longer the better, with about seven days being a happy medium.
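To make the idea concrete, here is a toy collector in Python using the psutil library (pip install psutil). It samples system-wide disk I/O counters once a second and reports read and write IOP/s; a real study tool runs for days and also records latency and footprint.

```python
import time
import psutil

def sample_iops(seconds: int = 10) -> None:
    """Print read/write IOP/s once a second, derived from counter deltas."""
    prev = psutil.disk_io_counters()
    for _ in range(seconds):
        time.sleep(1)
        cur = psutil.disk_io_counters()
        print(f"read IOP/s: {cur.read_count - prev.read_count:5d}   "
              f"write IOP/s: {cur.write_count - prev.write_count:5d}")
        prev = cur

sample_iops()
```

This is illustration only – the vendor tools mentioned below do the long-running collection and collation for you.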

Depending on how many systems you have at your location, the collation of this data can be a bit of a beast. Several vendors have tools to help collate it, and they are free to download and worth the effort. Here is a sample report from a personal favorite that supports individual systems as well as VMware. Click Here to download a copy to run in your environment.

This collection of data will give you insight into how much data storage your environment uses, how often that data changes, and how fast your systems are trying to write data to that storage – roughly the equivalent of horsepower and gas mileage in a car. There are several subtle nuances that will become important later in the decision process, but this collection will get you the information to start. Once you have it, it is possible to put together a good picture of your environment. The data shows where you are and allows some assumptions about how you may use storage in the future. You will run into terms like Input/Output Operations per second (IOP/s), latency, and queued operations. Save the questions for later; we have more work to do.

How will I use my storage appliance?

How you intend to use this storage is an important factor in what storage is best for you. Is this storage for a single application like a database, or do you plan to share it across several servers and allow each to use a portion of it? This may determine the sort of connectivity that you need. In addition to the number of applications that may be using this storage, what are those applications doing? Databases tend to need more structured access to data, and the faster the storage, the better the database will generally perform. If you have several people on the database at the same time, this may become a factor. For general storage, speed isn’t always the issue so much as usable space.

Is virtualization a factor in your storage decision? To achieve a truly virtualized environment, there must be shared storage between physically separate hardware servers. If you are virtualized, or plan to be, this can be a factor in the type of storage that you choose. If you have considered VDI (virtual desktop infrastructure – the “virtualization” of the desktops in your environment), then that will most definitely be a consideration in your decision.

How do I attach?

How do you plan to connect your storage to the hungry applications that need it? Unlike traditional direct-attached storage, a SAN or NAS doesn’t just clip into your server using existing slots in the chassis. There is a bit of planning involved.

The most common way to connect NAS and SAN devices is over your existing network. No problem, if your network is serving your needs well. If not, then some planning is needed to best get your applications connected to the data they crave. Fortunately, the numbers we collected in the storage study will give you some insight into this. There are alternate ways to connect the storage if your network isn’t cutting the mustard. Some are best for databases, some are designed to allow low-cost access and growth, and some are hugely expensive but fast as blazes – definitely for specialized applications. Just collect your thoughts on this and we can use them in a further discussion.

Once you have this information collected, you are well on your way to having the information that you need. Doing a storage study, determining how you intend to use your SAN, and determining how you will connect servers to the storage will start you on your journey to finding the correct SAN or NAS for you.

We will address each of these questions in more depth in future articles, so stay tuned.


A Quick Discussion on Disaster Recovery Planning

I got a call from a customer not long ago asking me questions about Disaster Recovery planning. Now we’ve all developed DR plans, and a quick search on the interwebs will get even the novice started on the basics, but there were three things that we went over that dredged up old memories and I thought I would share them here. Those things are prioritizing your servers (or functions), doing the math, and asking somebody to help.

A Place for Everything and Everything has a Place

When I worked as a systems engineer at a large company, we developed a divisional DR plan. After a lot of busy work thinking we needed to recover everything now – and not getting that to happen without a project budget we could denominate in gold bars – we recognized that not every server or business function needed to be up immediately. There was a method to recovering systems, and we decided to group everything into three classes with different RPOs and RTOs (recovery point and recovery time objectives).

The most important systems were classified as “A” systems, and were the first to be recovered. These were the business critical boxes that needed to be up ASAP. Systems that directly related to and directly impacted the business lines and areas of the business that were visible to the customer.

When the Class “A” systems were completed, or well underway, we could focus on the Class “B” systems. These were not as critical as the Class “A” systems, but were still important to the business: internal systems that the business needed to run but that were not immediately visible to the public, or systems that could wait until the Class “A” boxes were up.

The last class was the Class “C” systems. These needed to come back eventually, but had the longest RTO and the greatest RPO of the bunch – the “nice to have” systems. By categorizing our recovery (sketched below), we could get the important stuff done first, and then work on the rest.
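The classification boils down to a small table. Here it is as a data structure, with hypothetical systems and made-up RTO/RPO targets; sorting by class gives the recovery order.

```python
from dataclasses import dataclass

@dataclass
class Tier:
    rto_hours: int  # how long until the system must be running again
    rpo_hours: int  # how much data loss is tolerable

TIERS = {"A": Tier(rto_hours=4, rpo_hours=1),
         "B": Tier(rto_hours=24, rpo_hours=12),
         "C": Tier(rto_hours=72, rpo_hours=24)}

# example inventory -- your department-head questionnaires fill this in
systems = {"payroll": "A", "email": "A", "order entry": "A",
           "intranet wiki": "B", "build server": "B", "test lab": "C"}

for name, cls in sorted(systems.items(), key=lambda kv: kv[1]):
    t = TIERS[cls]
    print(f"Class {cls}: {name:14s} RTO {t.rto_hours:2d}h   RPO {t.rpo_hours:2d}h")
```

The numbers matter less than the fact that every system has an agreed class before the disaster, not during it.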

Do the Math – It’s a Simple Concept

Just like your professors told you back in school, do and show your work. Go ahead and run those storage studies so you know how much you will be recovering. Do a growth test and see what the “delta” (information change) is on your local systems on a daily and weekly basis. Then plan around that. Run the numbers, build a spreadsheet. Are your WAN connections big enough to replicate your data within your time allowance (see the quick check below)? How often can you make snapshots with the available space on the SAN or NAS? Do you have enough hardware to recover what you plan to recover at the site? Something as simple as: can you still read your tapes (boy, am I dating myself!)?
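The WAN question is the one that most often sinks a plan on paper, and the arithmetic fits in a few lines. The figures here are examples; use your own delta, link speed, and window.

```python
delta_gb = 120       # nightly change on local systems (example)
wan_mbps = 100       # WAN link speed (example)
efficiency = 0.7     # protocol overhead, competing traffic, etc. (assumed)
window_hours = 8     # overnight replication window (example)

# GB -> megabits, divided by effective throughput, converted to hours
hours_needed = (delta_gb * 8_000) / (wan_mbps * efficiency) / 3600
verdict = "fits" if hours_needed <= window_hours else "DOES NOT fit"
print(f"Nightly delta needs {hours_needed:.1f} h of an "
      f"{window_hours} h window: {verdict}")
```

Run it with your real numbers before the auditors do it for you.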

Another do-the-math concept is to activate your plan with a limited scope. Nothing will show you where your plan is weak like trying to recover a small amount of data. It doesn’t have to be a full-on test, but activate your plan for a single server. Send someone over to the DR site and have them try to recover last night’s email server, or the HR system, or only a small portion of a system. Pick 100 records to restore – just enough to tell you where your plan needs more work and where you can improve.

If your company is big enough to have one, invite the audit department to tag along. Nothing impresses the audit folks and regulators (if you are in that line of business) like testing your plan and working to improve it. Nothing is perfect the first time around or even the seventh, so do the math to improve it.

Ask somebody. But not just anybody

The Beatles were on to something there. There are people in your organization who can help you out. When we classified the systems in the organization into different classes, we didn’t just pick those systems at random. We asked for help. The IT department sent out questionnaires to department heads and had them rank the systems that we had identified for importance and impact. We also asked about any systems that we might have missed that were not on the list. You would be amazed at the systems WE thought were important versus the systems the BUSINESS thought were important.

Remember that your DR plan is not an end product. It is designed to let the IT assets of your company help recover the business lines of your company.  Of course, information is vital to your company, but how long will you be in business if the widgets don’t get made? If Accounting needs the company chat system to be up first, then the chat system needs to be up first. And no matter what anyone says, email is a Class “A” system. If management doesn’t believe that, turn it off for an hour and see how the phones light up.

Nothing that I have said in this article is rocket science; it is just a few lessons learned from building a plan and then working to test it. Technology changes, and thus the tools used to implement your plan will vary over time, but the fundamentals – prioritizing your servers, doing the math, and involving the business lines for help – still remain pertinent today.
