filesystems - Lots of small files or a couple huge ones?


In terms of performance and efficiency, is it better to use lots of small files (by lots I mean as many as a few million) or a couple (ten or so) huge (several gigabytes) files? Let's just say that I'm building a database (not entirely accurate, but all that matters is that it is going to be accessed a lot).

I'm concerned primarily with performance. My file system is currently ext3 on Linux (Ubuntu Server edition, if it matters), although I'm in a position where I can still switch, so comparisons between different file systems would be great. For technical reasons I can't use a real DBMS for this (hence the question), so "just use MySQL" is not a good answer.

Thanks in advance, and let me know if I need to be more specific.


Edit: I am going to be storing lots of small pieces of data, which is why using lots of small files would be easy for me. If I went with a few big files instead, I'd only be retrieving a few KB from them at a time. I'd also be using an index, so that's not really a problem. Also, some of the data points to other pieces of data (it would point to the file in the many-small-files case, and to the data's location within the file in the large-files case).

There are a lot of assumptions here but, for all intents and purposes, searching through a large file will be much quicker than searching through a bunch of small files.

Say you are looking for a string of text contained in a text file. Searching one 1 TB file will be much faster than opening 1,000,000 one-megabyte files and searching through each of them.

Each file-open operation takes time. A large file only has to be opened once.
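To get a rough feel for the per-open cost on your own machine, something like the following Python sketch could be used. It writes the same records once as many small files and once as a single file, then times reading both. All names here (`write_small_files`, `compare`, the `rec_*.bin` file names) are made up for illustration, and because the data was just written it will most likely come from the page cache, so the numbers mostly reflect open/close overhead rather than disk behaviour.

```python
import os
import tempfile
import time


def write_small_files(directory, count, size):
    """Create `count` files of `size` bytes each; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"rec_{i:07d}.bin")
        with open(path, "wb") as f:
            f.write(bytes([i % 256]) * size)
        paths.append(path)
    return paths


def write_big_file(path, count, size):
    """Write the same records back to back into one file."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(bytes([i % 256]) * size)


def read_small_files(paths):
    """One open() per record: pays the open/close cost every time."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total


def read_big_file(path, count, size):
    """A single open(), then `count` sequential reads."""
    total = 0
    with open(path, "rb") as f:
        for _ in range(count):
            total += len(f.read(size))
    return total


def compare(count=2000, size=1024):
    """Return (bytes_small, secs_small, bytes_big, secs_big)."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_small_files(tmp, count, size)
        big = os.path.join(tmp, "all_records.bin")
        write_big_file(big, count, size)

        t0 = time.perf_counter()
        n_small = read_small_files(paths)
        t1 = time.perf_counter()
        n_big = read_big_file(big, count, size)
        t2 = time.perf_counter()
    return n_small, t1 - t0, n_big, t2 - t1
```

Calling `compare()` and printing the result gives two byte counts (which should match) and two timings. Treat the timings as a hint only; file system, cache state and record size all change the picture.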

And, considering disk performance, a single file is much more likely to be stored contiguously than a large series of files.

... Again, these are generalizations without knowing more about your specific application.
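For what it's worth, the large-file-plus-index layout mentioned in the question's edit could look roughly like the Python sketch below. It is only an illustration under assumptions: the function names and the in-memory dict index are hypothetical, and a real index would need to be persisted somewhere.

```python
def pack_records(path, records):
    """Write records back to back into one file.

    Returns an index mapping each key to (offset, length).
    """
    index = {}
    with open(path, "wb") as f:
        for key, payload in records.items():
            index[key] = (f.tell(), len(payload))
            f.write(payload)
    return index


def read_record(f, index, key):
    """Seek to a record's offset in an already-open file and read it."""
    offset, length = index[key]
    f.seek(offset)
    return f.read(length)
```

The file is opened once and kept open, and each lookup is a seek plus a read of a few KB. A record that points to another record can simply store that record's key, or its (offset, length) pair, instead of a file name.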

Enjoy,

Robert C. Cartano

