I am the author of the linked FAQ, and I just wanted to clarify
question number 14. The answer is intended for developers who find that their programs do not work with files larger than 2GB by default. Using larger files on 32-bit Linux is no problem as long as you have the
right defines at the beginning of the program, before any includes.
This is all aimed at developers. What this means for users is that the most important programs on 32-bit Linux do work with large files by now. There are a lot of programs that rarely touch files larger than a few MB, and for those programs you might never notice whether they support large files or not.
As a user you may run into a program which is typically used for files in the 100MB to 1GB range and be surprised to find it does not work with files of 2GB or more. In that case you could blame the developer of the program, or you could blame the developers of glibc for not enabling large file support by default. A much more productive approach would be to download the source, add a few defines in the right places, compile, and thoroughly test the program on large files.
Thus as a user, even if you need to work with files larger than 4GB, you might be able to do so on a 32-bit machine. However, there are cases where you might still benefit from a 64-bit machine. If you have more than 896MB of RAM, memory management on IA32 Linux starts getting tricky. Older kernels didn't support that much at all, and even though 2.4 and later can handle more RAM, you don't get the full benefit of the RAM above 896MB. If you have 2GB or less, you probably shouldn't worry about this. If you have 4GB or more, you definitely should: with that much RAM a 32-bit architecture will be too limited.
There is another reason why you might want a 64-bit architecture (even if you have less than 1GB of RAM): the virtual address space. Even if you don't have that much physical RAM, you might still need that much virtual address space. There are a number of reasons why your virtual address space might need to be significantly larger than your physical RAM. If you have enough swap, a process can simply use more. In some cases the most efficient way to solve a problem involves mapping the same physical RAM at a few different virtual addresses. Because of fragmentation you might not be able to use all of the available address space. Finally, a program might need to mmap large files, in which case you will need address space for that. Combine the four, and a few GB of address space will not seem like a lot. There are
tweaks which will help you get the most out of a limited address space, but in the end they will only buy you a little, and if your data grows, you might end up needing a 64-bit architecture anyway.
As a developer you should be able to figure out whether your applications need more than the 3GB of address space they can typically get on a 32-bit architecture. As a user you might want to trust what the developer of the software tells you. If there is a particular piece of software you need to use to process large amounts of data, and the documentation doesn't make any recommendations about 32 vs. 64-bit architecture, I think you should go ahead and ask the developer what is recommended. (The answer might be that the application can benefit from a 64-bit architecture, but has received the most testing on a 32-bit architecture. In that case you will have to ask yourself whether you want to be the guinea pig and possibly gain the ability to work with larger amounts of data, or whether you want to use a 32-bit architecture and hope you don't need to work with data too large to be handled.)