Please note that this series of articles is not based on any scientific fact sheet and I do not claim that my view of things is the correct one. Also I am not coercing anyone into anything. I am merely sharing my view on setting up a development environment. You are invited to disagree and debate it, but please do that in a civilized manner. I want no flame wars. :)
We usually talk a lot about code and about the tools we use to write and maintain it. We talk about different operating systems and cross-platform development and how all that impacts our code. We talk about code bases and portability, about code formatting and code quality. We talk about patterns, algorithms and data structures.
We talk about a lot of things, but I rarely see us talk about the technical environments that enable us to develop code in the first place. And environments go beyond that. They store our personal data, host our operating systems, stream media and do a lot of other things not directly related to development. But if there is no environment, there is no development. A good environment is critical to good, fast, efficient, quality development. If the environment is not optimized, you will spend more time developing, because you are preoccupied with small and large chores that you would not have to do in a properly set up environment. A good environment is there without you even noticing it: it does not get in your way, it keeps your data safe, and it lets you work in a higher gear.
I have wanted to write about this for a long time. I will make a series of articles about my environment because I feel it has matured over the years into something that is very good for serious development. By writing about it I can also cover the general approach to setting up a good environment.
If I first look at the development environment as a collection of logical parts then I would describe it as:
- Storage units
- Processing units
- Display units
- I/O units
- Networking units
- Supporting HW
Maybe I could include something more or leave something out, but from the developer's perspective this list should cover the major parts. Let's say your environment is a minimal one: a single computer. I could then assign the hardware to the logical parts from the list:
- Storage units: HDD
- Processing units: The CPU
- Display units: One or more monitors
- I/O units: Keyboard, mice, optical drives, USB devices etc…
- Networking units: The Ethernet card
- Supporting HW: Maybe just a printer, or nothing
I will start with the storage units because I feel they can be one of the most diverse and complicated parts, and they are very important both for development and for the rest of the computing environment. They hold all the permanent data (by storage I mean hard drives, solid state drives and any other permanent storage; this does not include RAM and other volatile storage).
Your storage system can be a simple HDD in your computer. But not only is that inconvenient, as I will show later, it is also not safe if you do not back it up regularly or do not have a RAID 1 array or something similar. I intend to talk about NAS here and why it is so good that you should always use it if you have the chance.
Network Attached Storage
Computer parts in general, and hard drives in particular, are not reliable. I have seen too many people rely blindly on their hard drives, never once thinking they can fail. And they do fail. In some of those cases the failure meant years of work and personal data lost. I have seen an executive in a big firm lose 7 years of data because in 7 years he never made a single backup. Sad, but true. But that is not the point of this article. Developers know that and they do backups.
YOU DO BACK UP YOUR DATA, DON’T YOU?
The point here is NAS storage. Wikipedia says this:
“Network-attached storage (NAS) is file-level computer data storage connected to a computer network providing data access to a heterogeneous group of clients. NAS not only operates as a file server, but is specialized for this task either by its hardware, software, or configuration of those elements. NAS is often manufactured as a computer appliance – a specialized computer built from the ground up for storing and serving files – rather than simply a general purpose computer being used for the role.”
The advantages I see in NAS are:
- One point where all the data from all your systems is stored
- Centralized storage means easier backups and easier security and safety of data
- Different operating systems can access it seamlessly over SMB (for example via Samba)
- You can abstract your data away from your computers meaning you can make your data “portable”
Cons are in my view:
- NAS is a single point of failure if there is no redundancy (and in home environments there usually is none)
- You need some expertise to set it up yourself if you do not use premade NAS boxes
- Even on a gigabit network, data transfer is slower than from local drives
But in my mind there is no doubt: the pros are so big, and I talk from practice, that they far outweigh the cons. And if you do backups then you still have your data if the main NAS fails.
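To put the last con into rough numbers, here is a minimal sketch that compares how long a copy takes over gigabit Ethernet and from a local drive. The only hard figure in it is the theoretical ceiling of gigabit Ethernet (1000 Mbit/s, which is 125 MB/s). The efficiency factor and the local drive throughput are assumptions picked purely for illustration, not measurements from my boxes, so plug in your own values.

```python
def transfer_seconds(size_gb: float, throughput_mb_s: float) -> float:
    """Seconds needed to move size_gb gigabytes at throughput_mb_s MB/s (decimal units)."""
    return size_gb * 1000 / throughput_mb_s


# Gigabit Ethernet: 1000 Mbit/s divided by 8 bits per byte = 125 MB/s at best.
GIGABIT_MB_S = 1000 / 8

# Assumed, illustrative figures (not measurements):
NETWORK_EFFICIENCY = 0.8   # guess at protocol and file sharing overhead
LOCAL_DISK_MB_S = 150.0    # hypothetical sequential speed of a local drive


def compare(size_gb: float) -> dict:
    """Return the estimated copy time in seconds for both paths."""
    return {
        "nas_gigabit_s": transfer_seconds(size_gb, GIGABIT_MB_S * NETWORK_EFFICIENCY),
        "local_disk_s": transfer_seconds(size_gb, LOCAL_DISK_MB_S),
    }
```

Calling `compare(10)` gives the two estimates for a 10 GB copy. With these assumed numbers the network path is the slower one, but for everyday development work the difference is something you can live with.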
A good NAS setup
I would like to present what is, in my opinion, a good NAS setup consisting of two NAS servers. You can substitute one server with remote storage, a local external drive or something similar. The important thing here is that there is no single correct solution. There are better and worse solutions, each with its pros and cons, and there may be many viable ones. This applies not only to NAS but to the development environment as a whole. So do not take my words as sacred, or as the only correct setup. Take them for what they are: one of the possible solutions, and the one I happen to use.
I have two physical boxes for storage. I could replace them with virtual machines, but I choose not to for two reasons:
- I want good NAS performance and the VM server I use already has enough VMs in use for other purposes
- Those two boxes need to be as far apart as possible in case of some disaster. We want one of them to survive.
Those two boxes are nothing special. One is custom made with a Celeron and 12 GB RAM (because I use ZFS and it eats RAM like candy) and a bunch of drives, 3 of them in a ZFS RAID-Z pool (the rough equivalent of RAID 5). The exact setup of this box is:
- Dual core Celeron
- Cheap motherboard
- 12 GB RAM
- 3x 1.5 TB drives in ZFS
- 1x 320 GB drive for sharing, torrents and other temporary stuff
- 8 GB USB stick for the FreeNAS installation (the OS)
The other box is an HP MicroServer that I bought recently.
It is a Turion-based server with 2 GB RAM and 4 HD slots. I put in 4x 3 TB drives and made 2x RAID 1 arrays out of those 4 drives, which nets a total of 6 TB. Now you may ask what the heck I need all that space for, and why 2 servers. Well, the first server is my primary NAS; that is why it has a lot of RAM. It is faster than the HP. It also has RAID 5, which is fast for reads but not so reliable. If your hardware RAID or your software RAID fails, you may lose the data. It has happened and it will happen again. You cannot rely on your RAID 5 to be bulletproof.
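The capacity arithmetic for both boxes can be checked with a small sketch. It uses the textbook rules only: a RAID 5 style array gives up one drive's worth of space to parity, and a mirror holds one drive's worth of data no matter how many copies there are. The function names are my own for this example, and real usable space will come out somewhat lower because of filesystem overhead and the TB versus TiB difference.

```python
def raid5_usable_tb(drives: int, size_tb: float) -> float:
    """Raw usable space of a RAID 5 style array: one drive's worth goes to parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * size_tb


def raid1_usable_tb(drives: int, size_tb: float) -> float:
    """Raw usable space of a mirror: every drive holds the same data."""
    if drives < 2:
        raise ValueError("RAID 1 needs at least 2 drives")
    return size_tb


# Primary NAS: 3x 1.5 TB drives in a RAID 5 style ZFS pool.
primary_tb = raid5_usable_tb(3, 1.5)

# Secondary NAS: 4x 3 TB drives arranged as two separate 2-drive mirrors.
secondary_tb = 2 * raid1_usable_tb(2, 3.0)
```

Here `primary_tb` comes out at 3 TB and `secondary_tb` at 6 TB, which matches the total given for the HP box.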
While the first server is a NAS workhorse with all the active data, the second one (the HP) is slower and has less RAM. But it uses less power, is quieter and has way more storage. This one is for archiving and for the timeline incremental backups of active data. Each of those two uses gets 3 TB of space. And while you may think that is way too much free space, I can tell you I have a lot of data, and weekly whole-VM backups take a lot of space too. Storage is cheap these days and the need for space will only increase. This way I am covered for the near future. Also, I work from home, so this is basically my office setup and home setup in one. Additionally, my wife’s parents use the same NAS for backups and data.
It may seem like overkill, but in my opinion it is not. You can build a NAS server very cheaply, and if you pay attention it will consume very little power. For me data safety has no price, even more so these days, when everything is moving to digital distribution: movies, music, books, mails, IM history, code, photos etc…
I would like to point out that there are a lot of ways to set this up (all of them assume that you have a primary NAS):
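For readers who have not met the term, the idea behind an incremental backup can be sketched in a few lines: each run copies only the files that changed since the previous run into a fresh snapshot folder. This is not the software I actually use on these servers. The function name, the folder layout and the use of modification time as the change signal are all assumptions made for this example, and the snapshot folder is assumed to live outside the source folder.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, snapshot: Path, since: float) -> list:
    """Copy files under `source` modified after `since` (epoch seconds) into `snapshot`.

    Returns the relative paths of the copied files, in sorted order.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        # Only regular files that changed after the previous backup run.
        if path.is_file() and path.stat().st_mtime > since:
            relative = path.relative_to(source)
            target = snapshot / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(relative)
    return copied
```

A real tool also has to deal with deleted files, permissions and interrupted runs, so take this only as an illustration of why incremental snapshots need far less space than repeated full copies.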
- online cloud backup as the backup system
- external hard drive as the backup system
- an additional drive array in the same NAS box as the backup system
- a VM as a secondary backup system
I think this is long enough for the first post in the series. In the next one I will look into the actual software and solutions that I use for those two servers and the hardware that uses them. I hope I told something new or interesting, at least for some of you out there. Developers can be a tough audience. Anyway, I think it is good to talk a little more broadly from time to time. If you have questions about this article or wishes for the next ones, please do drop a comment.