I have been doing Veeam backup sizing for a Hyper-V cluster recently, and one thing that caught my interest is how the block size affects application performance.
Please refer to the following study that I did.
Storage is ultimately just ones and zeros, but the operating system does not read or write them one by one; it groups them into units called blocks and then reads/writes a whole block at once.
These blocks – called clusters or allocation units in Microsoft terminology – are combined to form files, which are tracked by the Master File Table (MFT). All information about a file, including its size, time and date stamps, permissions, and data content, is stored either in MFT entries or in space outside the MFT that is described by MFT entries.
Understanding the MFT
When you format a volume with NTFS, Windows Server 2003 creates an MFT and metadata files on the partition. The MFT is a relational database that consists of rows of file records and columns of file attributes. Because the MFT stores information about itself, NTFS reserves the first 16 records of the MFT (approximately 16 KB) for metadata files.
To prevent the MFT from becoming fragmented, NTFS reserves 12.5 percent of the volume by default for the exclusive use of the MFT. This space, known as the MFT zone, is not used to store data unless the remainder of the volume becomes full.
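To put that 12.5 percent in perspective, here is a rough calculation (my own illustration, not from the Microsoft documentation) of the default MFT zone reservation for a couple of volume sizes:

```python
# Default NTFS MFT zone: 12.5% of the volume is reserved for the MFT
# and is only used for ordinary data once the rest of the volume fills up.
MFT_ZONE_FRACTION = 0.125

def mft_zone_bytes(volume_bytes: int) -> int:
    """Bytes reserved for the MFT zone on a volume of the given size."""
    return int(volume_bytes * MFT_ZONE_FRACTION)

for label, size in [("1 TB", 1024**4), ("10 TB", 10 * 1024**4)]:
    print(label, "volume ->", mft_zone_bytes(size) // 1024**3, "GB reserved")
# 1 TB volume -> 128 GB reserved; 10 TB volume -> 1280 GB reserved
```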
Relationship Between Cluster Size (4K Default in Windows) and File Size
If you write a reasonably large file (say 100 GB) in 4K blocks, that file is made up of about 26 million blocks – 26 million blocks to keep track of for that one file. Written in 64K blocks, though, it is only about 1.6 million blocks for the same file.
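A quick sanity check of this arithmetic, sketched in Python (the 100 GB figure is just the example from above):

```python
# Number of clusters needed to store a 100 GB file at two NTFS
# allocation unit sizes (simple ceiling division; all sizes in bytes).
FILE_SIZE = 100 * 1024**3          # 100 GB

def clusters_needed(file_size: int, cluster_size: int) -> int:
    """Each cluster holds at most `cluster_size` bytes, so round up."""
    return -(-file_size // cluster_size)   # ceiling division

print(clusters_needed(FILE_SIZE, 4 * 1024))    # 26,214,400 -> ~26 million
print(clusters_needed(FILE_SIZE, 64 * 1024))   # 1,638,400  -> ~1.6 million
```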
I prepared the following diagram to make this concept easier to understand.
PowerShell to add a newly inserted disk and format it with a 64K cluster size
#Add a new disk to the VM, then check its disk number
Get-Disk
#Bring the newly added disk online
Set-Disk -Number 1 -IsOffline $false
#Initialize a RAW disk for first-time use, enabling the disk to be formatted and used to store data
#-PartitionStyle GPT/MBR - defaults to GPT
Initialize-Disk -Number 1 -PartitionStyle GPT
#Create Partition and assign Drive Letter E:
New-Partition -DiskNumber 1 -DriveLetter E -UseMaximumSize
#Format E: with -AllocationUnitSize 65536 bytes (64 * 1024 = 64K) and -UseLargeFRS (for a Veeam backup repository)
Format-Volume -DriveLetter E -AllocationUnitSize 65536 -FileSystem NTFS -NewFileSystemLabel "DATA" -UseLargeFRS
#Command-line check of the cluster size ("Bytes Per Cluster" in the fsutil output)
fsutil fsinfo ntfsinfo E:
#PowerShell to check the BlockSize / Cluster Size (Get-CimInstance replaces the deprecated Get-WmiObject)
Get-CimInstance -ClassName Win32_Volume | Where-Object DriveLetter -EQ "E:" | Select-Object Caption, BlockSize
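Note that BlockSize is reported in bytes, which is easy to misread (65536 bytes is 64K, i.e. 64 * 1024). A tiny helper of my own to render the value the same way the Format-Volume step above expresses it:

```python
# Convert the BlockSize value (in bytes) reported by Win32_Volume into
# the "K" notation used when choosing an NTFS allocation unit size.
def cluster_size_label(block_size_bytes: int) -> str:
    """Render an NTFS allocation unit size in KB, e.g. 65536 -> '64K'."""
    return f"{block_size_bytes // 1024}K"

print(cluster_size_label(65536))  # 64K - the volume formatted above
print(cluster_size_label(4096))   # 4K  - the Windows default
```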