Throw Your Isilon in the Data Lake
November 16, 2016
Customers have a ton of requirements around log aggregation, file shares, media streaming repositories, and just a simple place to store objects. It can be difficult to manage all of these different use cases, but Dell EMC Isilon might just be the solution that helps. Customers often end up with several small islands of storage used for different purposes. Maybe this is because of a brand new requirement like “all security camera data will be stored for seven years”, which might require some additional storage space. Whatever the reason, these islands may even come from different storage vendors, which makes them tough to manage and requires more storage administrators with differing skill sets.
Isilon may be a good solution to aggregate all of these storage islands into a single scalable solution that can handle many different types of workloads. Throwing all of your data into a single pool is often described as a “Data Lake” due to the volume of the pool and the amount of stuff stored on it.
What is an Isilon?
Dell EMC Isilon provides a single namespace for a giant pool of storage. A single Isilon cluster can scale out to 144 nodes in the cluster and a total of about 60 Petabytes of data. Each node of the cluster provides additional performance and capacity to the overall Isilon cluster. One of the neat things about the solution is that you can add a different node type to the cluster where some nodes can provide more performance and some are better for dense capacity. You aren’t constrained by the first node type that you chose.
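To put those limits in perspective, here is a quick back-of-the-envelope calculation in Python. It only uses the two figures quoted above and assumes decimal units (1 PB = 1000 TB) and capacity spread evenly across nodes, which a real cluster with mixed node types would not have. The function name is just for illustration.

```python
# Back-of-the-envelope arithmetic using the figures quoted in this post.
MAX_NODES = 144        # maximum nodes in one cluster (from the post)
MAX_CAPACITY_PB = 60   # approximate maximum capacity (from the post)
TB_PER_PB = 1000       # assumption: decimal units


def average_tb_per_node(capacity_pb=MAX_CAPACITY_PB, nodes=MAX_NODES):
    """Average capacity per node if it were spread evenly (an assumption)."""
    return capacity_pb * TB_PER_PB / nodes


if __name__ == "__main__":
    print(f"{average_tb_per_node():.0f} TB per node on average")
```

That works out to a little over 400 TB per node at the maximum scale, which gives a rough sense of how dense the capacity-oriented nodes have to be.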
An Isilon cluster is designed to ingest many different types of workloads at the same time. To do this for an enterprise, it has to support a variety of protocols for adding data. Currently, Isilon supports the following protocols, which give you tons of flexibility:
- SMB v3
- HTTP
- REST
- SWIFT
- HDFS
- NDMP
- NFS v4
- FTP
What Workloads do I Use on Isilon?
During Tech Field Day 12, David Noy (VP Isilon Product Strategy) was very clear about what types of workloads are best suited for an Isilon: streaming media, file shares, home directories, log files, analytical data, and just about any other type of file-level data. David also mentioned that you can use an Isilon for storing virtual machines and backups, but those are probably not the best fit for the product. Dell EMC has other technologies, such as XtremIO / VMAX / VNX for virtual machines and Avamar / Data Domain for backup-related tasks, which are much better suited for those workloads.
All travel expenses and incidentals were paid for by Gestalt IT to attend Tech Field Day 12. In addition, Dell EMC provided a gift to all delegates but with no expectations about the coverage through this blog or social media.
Isilon Features
If you’re going to be storing Petabytes of data, then you better be able to manage security and reliability of those bits. Isilon is a full featured enterprise solution which can provide capabilities such as snapshots through SnapshotIQ, replication through SyncIQ, dedupe through SmartDedupe and various other “Smart” products.
Each of the nodes in the cluster can provide additional tiering for caching of the data. To improve read performance, caching mechanisms keep hot data on faster media. Any data that hasn’t been accessed recently will be pushed down to spinning disk to make room for more recently used data. An additional feature has even been added so that you can archive the data off to public cloud storage on AWS or Azure.
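If the hot/cold idea is new to you, here is a toy Python sketch of it. This is not how Isilon implements its caching; it is a minimal least-recently-used (LRU) read cache in front of a slower store, and the class name, method names, and slot count are all made up for illustration.

```python
from collections import OrderedDict


class TwoTierStore:
    """Toy model: a small fast read cache in front of a large slow tier."""

    def __init__(self, cache_slots):
        self.cache_slots = cache_slots
        self.cache = OrderedDict()  # hot data, most recently used last
        self.disk = {}              # stands in for spinning disk

    def write(self, name, data):
        # Every file lives on the slow tier; the cache only holds copies.
        self.disk[name] = data

    def read(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)    # mark as recently used
            return self.cache[name]
        data = self.disk[name]              # cache miss: go to the slow tier
        self.cache[name] = data
        if len(self.cache) > self.cache_slots:
            self.cache.popitem(last=False)  # drop the least recently used copy
        return data
```

Reading a file pulls it into the fast tier, and once the cache is full the copy that has gone longest without a read is dropped, leaving only the version on the slow tier.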
A big feature for financial and health care customers is the “SmartLock” feature, which allows you to prevent files from being changed. This is useful to protect sensitive data from worms or other malicious activity. Again, this is very important for an enterprise solution that is housing this much data: it must be protected and managed well. And of course, the directory tree housed by the Isilon can be managed through role-based access control (RBAC), which is a requirement at this point for almost all storage arrays. If you’re very concerned about security, you can also use self-encrypting drives to encrypt data at rest.
Administrators will have access to analytics on the data being stored on the cluster so they can effectively manage what is being added to the Isilon. With the amount of data being stored on the solution, being able to identify which applications are hot spots and are chewing up capacity or resources is a must-have to make sure the cluster is being used effectively and to plan for future capacity needs.
Try it yourself
Yeah, it’s an enterprise product, but if you want to get your hands on it, the Isilon team provides a free version of the IsilonSD Edge product, which is an Isilon virtual appliance typically used for remote sites. The free version lacks a few features, but you can get the look and feel of the solution and store up to 36 TB of data in your home lab if you wish. Just don’t use it for production purposes, for fear of violating the EULA. http://www.emc.com/products-solutions/trial-software-download/isilonsd-edge.htm?PID=STORE-DL-ISD