T-SQL Tuesday #139 – Handling Hybrid Data

Introduction

This month’s T-SQL Tuesday is being hosted by Ben Weissman (blog|twitter). The subject is Hybrid Data Estates and how people are managing them. As Ben describes it, “I would therefore like to invite you all to share your hybrid and edge experiences”. This post will be my contribution for T-SQL Tuesday #139 – Handling Hybrid Data.

This is a pretty wide subject area that gives everyone plenty of possibilities for blog topics. Just writing a useful blog post and getting quality links back and forth from other blogs in the SQL Server community is good for everyone who participates!

BTW, reading and then commenting on the T-SQL Tuesday blog posts of other participants helps both you and the other blogger. They get views and comments on their blog, and you get backlinks to your blog. T-SQL Tuesday also helps encourage people to blog more often, which is good.

In my case, I am going to talk about some of my experiences with handling personal hybrid data.

T-SQL Tuesday #139 – Handling Hybrid Data

Two recent conversations on Twitter gave me the idea for this post. The first was someone asking for recommendations for a network attached storage (NAS) device. There were many responses that ran the gamut from various commercial NAS devices to people scoffing at the idea of a NAS.

The second conversation was a comment on a post of mine about an affordable multi-gig Ethernet switch for home use. The commenter basically stated that most people do not have multiple computers at home, and they don’t have any wired Ethernet.

These two seemingly unrelated conversations made me think about my assumptions about what hardware and infrastructure people typically have in their homes. If you are a data professional with a special interest in hardware, it is all too easy to assume that other people in the same profession will make many of the same choices you have. Obviously, this is not going to be true, for many reasons.

There is a Spectrum of Situations and Choices

Thin Client

One end of the spectrum is someone who has purposely minimized their personal computer hardware. Perhaps they have a work laptop, supplied by their employer, and a personal laptop that they bought themselves. Maybe their personal laptop is even something like a Chromebook.

They have an older combination Wi-Fi router/cable modem, supplied by their ISP. They live in an apartment or an older house, so they don’t have any wired Ethernet ports in the walls. Because of this, they use Wi-Fi for all of their connectivity in their home. Since they don’t want to deal with computer hardware, nearly all of their data is stored in the cloud. They also purposely use cloud compute resources as much as possible.

Fat Client

The other end of this spectrum is someone who has multiple desktop and laptop computers in their home. Perhaps there are multiple family members, who each have their own machine. Some members of the family may have more than one computer, used for different purposes. If one or more people in the family are data professionals, maybe someone has a dedicated computer lab.

This family lives in a newish house that does have wired Ethernet. At least in the United States, many houses that have been built in the last 10-15 years have some low voltage wiring capability with Cat 5 or better Ethernet and RG6 Coax drops in many rooms. This family uses their wired Ethernet connectivity as much as possible. They also have a high-end Wi-Fi 6 router and a high-end DOCSIS 3.1 cable modem that they own. This family also has a dedicated commercial NAS that they use for a local copy of their shared data. They may or may not also store data in the cloud.

Thin Client Issues and Advantages

The thin client family is completely dependent on their ISP for connectivity and bandwidth. Depending on their ISP, this may be no problem or it could be a significant problem. Some ISPs are great for reliability and some have frequent issues. In many parts of the United States, ISPs have local monopolies, so you may not have any other viable choices.

Depending on where you live and what technology your ISP is using, you might have limited bandwidth. You also might have asymmetrical bandwidth, where you have much higher download bandwidth compared to your upload bandwidth. Some ISPs also have monthly total bandwidth caps.

Comcast SpeedTest Result

Wi-Fi Bandwidth and Security

Only having Wi-Fi for local connectivity can also be a problem. Even the latest official Wi-Fi standard (Wi-Fi 6) does not have as much raw throughput under typical real-world conditions as 1 Gbps wired Ethernet. Most Wi-Fi 6 devices can only do about 700-800 Mbps on the 5 GHz band when the devices are in the same room. In real life, Wi-Fi 6 usually gets more like 50-60 Mbps on the 2.4 GHz band. Wi-Fi 5 and older will do less than this. Of course, if your ISP bandwidth is lower than your local Wi-Fi bandwidth, that will be your real bottleneck for many scenarios.
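
To put those numbers in perspective, here is a rough Python sketch of the ideal time needed to move a file over each kind of link. The link rates are the midpoints of the ranges quoted above, and the 10 GB file size is a hypothetical value picked only for illustration. Real transfers will be slower because of protocol overhead and interference.

```python
def transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Ideal time to move size_gb gigabytes over a link of link_mbps megabits/s."""
    # Decimal units: 1 GB = 1000 MB, and 1 MB = 8 megabits.
    size_megabits = size_gb * 8 * 1000
    return size_megabits / link_mbps


# Midpoints of the ranges quoted in the text, plus 1 Gbps wired Ethernet.
links = {
    "Wi-Fi 6, 2.4 GHz (~55 Mbps)": 55,
    "Wi-Fi 6, 5 GHz, same room (~750 Mbps)": 750,
    "Wired Ethernet (1000 Mbps)": 1000,
}

FILE_SIZE_GB = 10  # hypothetical file size, for illustration only

for name, mbps in links.items():
    minutes = transfer_seconds(FILE_SIZE_GB, mbps) / 60
    print(f"{name}: {minutes:.1f} min")
```

The gap between the 2.4 GHz case and the other two is the part that matters in daily use: it is the difference between waiting a minute or two and waiting most of half an hour.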

If you live in a high density residential area (an apartment or high density detached housing), you can have other Wi-Fi issues. Wi-Fi congestion and Wi-Fi security can be an issue, especially with older Wi-Fi standard devices. Even in a single family home on a multi-acre lot, you may have Wi-Fi congestion issues with many devices sharing your router. You probably have a number of non-computer devices sharing your Wi-Fi capacity.

Being “Thin Client” does have some advantages though. There is very little hardware to purchase, maintain and eventually replace. You can easily take your laptop anywhere else that has internet connectivity, and have nearly the same experience that you would at home. If you want to move somewhere else, there is less “stuff” to pack and move. Your purposely limited hardware infrastructure uses less space and less electricity in your home.

Fat Client Issues and Advantages

The fat client family is still highly dependent on their ISP for connectivity and bandwidth. But since they have significant local compute and storage capability, they can still function in a degraded manner with a temporary loss of internet connectivity. Just like the thin client family, the bottleneck that is outside their control is their internet bandwidth. No matter how much network bandwidth they have locally, their internet bandwidth depends on the ISP.

The fat client family has a lot more computer-related hardware to purchase, maintain, and eventually replace. Many people have absolutely no interest in being a computer support technician. Having your own personal I.T. infrastructure in your home takes up space and uses more electricity. If your home suffers a natural or man-made disaster, you could lose all your equipment and all of your local data.

The fat client family does have some important advantages. They have significant local compute and storage capacity, and much better local network bandwidth. With wired Ethernet, they can take much of the load off of their local Wi-Fi network. Even 1Gbps wired Ethernet lets you move local data around much faster than with local Wi-Fi. Multi-Gig Ethernet (2.5, 5.0 or 10Gbps) is finally becoming more affordable and widespread. This lets you move local data around much, much faster. With Multi-Gig Ethernet, your bottleneck often moves from your network to your storage.
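
The shift from a network bottleneck to a storage bottleneck can be sketched as taking the slower of the two rates. The storage throughput figures below are hypothetical round numbers, not measurements, and real drives vary widely.

```python
def effective_mbps(network_mbps: float, storage_mb_per_s: float) -> float:
    """Return the slower of the network link and the storage device, in Mbps."""
    storage_mbps = storage_mb_per_s * 8  # megabytes/s -> megabits/s
    return min(network_mbps, storage_mbps)


# Hypothetical sustained storage rates in MB/s (assumptions for illustration).
STORAGE = {"single hard drive": 200, "SATA SSD": 500}
# Nominal link rates in Mbps.
NETWORKS = {"1 GbE": 1000, "2.5 GbE": 2500, "10 GbE": 10000}

for net_name, net_mbps in NETWORKS.items():
    for disk_name, disk_rate in STORAGE.items():
        limit = effective_mbps(net_mbps, disk_rate)
        limiter = "network" if limit == net_mbps else "storage"
        print(f"{net_name} + {disk_name}: {limit:.0f} Mbps, limited by {limiter}")
```

With these assumed numbers, 1 GbE is always limited by the network, while 10 GbE is always limited by the storage on the other end.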

With a NAS, you can have a local copy of your data. This can be supplemented with cloud storage for some or all of your data. A decent NAS will have enough compute and memory to run things like Pi-hole, Docker, and small VMs. It can also supply disk resources for VMs hosted on different local hardware.

You can buy or build local client machines that are optimized for a specific workload, such as gaming, development, rendering, etc. Once you have that local hardware, it is yours to do with as you wish. There are no ongoing charges for compute or storage usage. As long as you have electricity, all of your local resources will be available.

So What is the Correct Choice?

Given the subject of this post, the answer should be obvious. A hybrid approach that uses the best parts of thin client and fat client can let you tailor your capabilities to your situation and budget.

For example, the extreme thin client family might make a few small changes that could improve their capability and redundancy. All of this extra equipment would cost perhaps $400-$500 and fit in one medium sized moving box.

Thin Client Changes

  • Buy a new Wi-Fi 6 router instead of renting a mediocre router from your ISP
  • Buy a new DOCSIS 3.1 cable modem (or non-cable equivalent) instead of renting from your ISP
  • Connect anything you can to the wired Ethernet ports on your Wi-Fi router
    • Perhaps your work or personal laptop
    • This reduces your Wi-Fi load
  • Buy an external USB 3.0/3.1 hard drive enclosure and drive(s) and connect it to your Wi-Fi router
  • Get a small UPS to protect your Wi-Fi router, cable modem and external storage

Fat Client Changes

The extreme fat client family could also make a few changes that might improve their experience and reduce their costs.

  • Make an effort to maintain the machines and infrastructure under your control
    • Things like software and firmware updates are important for reliability and security
  • Turn off machines that you are not using
    • This will reduce electrical usage and noise
  • Configure your machines to reduce power usage and noise
    • You can pick the appropriate power plan and do things like customize the fan curves
  • Consider reducing how many machines you have
    • This will free up space and reduce clutter
  • Use cloud storage to supplement your local storage
    • This gives you an off-site copy of some or all of your data
  • Make sure you understand all of the capabilities of your NAS
    • Modern NAS devices from Synology and QNAP can do many things besides store files
    • You can have your NAS sync data with a cloud service
    • You can have your NAS automatically back up client machines

What Do I Do?

Because of my historical and current usage pattern, I am much more on the fat client side of things. I have a YouTube channel where I film, edit and render 4K video. This makes me think that I “need” a powerful local workstation so I can edit and render more quickly.

I also have a dedicated gaming machine that has a direct 2.5 Gbps connection to a 2.5 Gbps LAN port on my router. This reduces my latency by a small amount, since I need all the help I can get playing online games!

I have a lot of large video files that I store on an older eight-bay Synology DS1817+ NAS. The NAS has eight 6TB Seagate IronWolf Pro drives in RAID 5.
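
As a quick sanity check on capacity, RAID 5 spends one drive's worth of space on parity, so usable space is roughly (number of drives - 1) times the drive size. The helper below is a minimal sketch of that arithmetic. It ignores file system overhead and the difference between TB and TiB, so the NAS will report somewhat less.

```python
def raid5_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Approximate usable capacity of a RAID 5 array, before file system overhead."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    # One drive's worth of capacity is consumed by parity.
    return (drive_count - 1) * drive_tb


raw_tb = 8 * 6
usable_tb = raid5_usable_tb(8, 6)
print(f"{usable_tb} TB usable out of {raw_tb} TB raw")
```

RAID 5 also tolerates only a single drive failure, which is one more reason to keep an off-site copy of anything important.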

The NAS has a 10GbE NIC that is connected to an eight-port Netgear XS708E 10GbE switch. There is another 10-port Netgear MS510TX Multi-Gig switch connected to that.

I have several machines with 2.5GbE or 10GbE NICs connected to these two switches. I also have an ASUS GT-AX11000 Wi-Fi 6 router connected to an older ASUS RT-AC86U Wi-Fi 5 router using AiMesh mode. This gives me better Wi-Fi coverage in my garage.

I have a number of things that I would like to improve with this setup. I’ve been eyeing the newer Synology DS1821+ NAS. This would let me have M.2 SSD caching in the NAS to reduce the current storage bottleneck when reading from and writing to the NAS. The newer NAS has a much more powerful CPU and double the RAM capacity compared to the DS1817+.

My current Comcast internet service is highly asymmetrical. At our house, we get roughly 600 Mbps for downloads (which is pretty good), but only about 18 Mbps for uploads, which is pretty painful. I need to see if I have any other service options from Comcast. I am using Dropbox and Microsoft OneDrive to store some data in the cloud, but I need to use the cloud more.
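
Here is a small sketch of what that asymmetry means when pushing data to the cloud. The two rates are the ones quoted above. The 100 GB project size is a hypothetical value, and the result is a best case that ignores protocol overhead and any throttling by the cloud provider.

```python
DOWN_MBPS = 600  # download rate quoted in the text
UP_MBPS = 18     # upload rate quoted in the text


def hours_to_move(size_gb: float, mbps: float) -> float:
    """Ideal hours to move size_gb gigabytes at mbps megabits per second."""
    # Decimal units: 1 GB = 8000 megabits; 3600 seconds per hour.
    return size_gb * 8000 / mbps / 3600


PROJECT_GB = 100  # hypothetical size of a folder of video files

ratio = DOWN_MBPS / UP_MBPS
print(f"Download is about {ratio:.0f}x faster than upload")
print(f"Upload of {PROJECT_GB} GB: {hours_to_move(PROJECT_GB, UP_MBPS):.1f} hours")
print(f"Download of {PROJECT_GB} GB: {hours_to_move(PROJECT_GB, DOWN_MBPS) * 60:.0f} minutes")
```

Under these assumptions, the same data that comes down in well under half an hour takes roughly half a day to go up, which is why large video files tend to stay on the local NAS.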

After a quick count, I discovered that I have thirteen working desktop machines and two laptops. I have enough components lying around to build at least 3-4 more desktop machines. This is a little excessive.

You might be wondering how this relates to hybrid data estates. I think there are a lot of parallels and many similar thought processes between personal data/infrastructure and professional data/infrastructure. Many of the decision points are driven by the same considerations.

Personally, I don’t want to be 100% dependent on cloud resources and internet connectivity. Having some high performance local compute and storage is very useful for me. I also understand people who want to have the least amount of local infrastructure that they can get away with.

Final Words

If you do have some local infrastructure, please make an effort to properly maintain it! This will usually keep it running better and improve your security. Here are some relevant posts:

I hope that this post about T-SQL Tuesday #139 – Handling Hybrid Data has been interesting and useful! Finally, I also want to thank Ben for hosting this month.

If you have any questions about this post, please ask me here in the comments or on Twitter. I am pretty active on Twitter as GlennAlanBerry. Thanks for reading!
