linux performance question #36287 24 Jun 23 01:29 PM
Valli Information Systems (OP)
I have a directory with about 50K files in it. While I execute the following command, launching new ashell jobs slows down to the point where they basically will not launch, although jobs already running in ashell appear to behave correctly. I have launched the command via the host command within ashell and from outside ashell, and it makes no difference. Jobs running outside ashell do not appear to be affected. Any suggestions on how I can troubleshoot this? It has actually been a problem with other disk-intensive Linux commands as well.

thanks in advance

for f in ./temp/*.pdf; do zip -j test.zip "$f"; done
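A side note on the loop itself: invoking zip once per file reopens and rewrites test.zip for every one of the 50K files. A single batched invocation, sketched below on the assumption of GNU findutils and Info-ZIP zip, does far less disk work:

Code
# one zip process per batch instead of one per file; -print0/-0 cope
# with unusual file names
find ./temp -maxdepth 1 -name '*.pdf' -print0 | xargs -0 zip -j test.zip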

Re: linux performance question [Re: Valli Information Systems] #36288 24 Jun 23 01:32 PM
Valli Information Systems (OP)
Also, this directory ./temp is a subdirectory of one of my ashell PPNs; could that be a factor?

Re: linux performance question [Re: Valli Information Systems] #36289 24 Jun 23 06:58 PM
Jack McGregor
This kind of problem is typical (at least in my experience) with directories containing more than, say, 10K files. (The lengths of the file names may also be a factor.) In any case, 50K is way too big for efficiency. It doesn't slow everything down, but any operation that needs to locate a file in that directory may have to perform hundreds or even thousands of individual disk seeks because of the way the directory entries are linked together. Some file systems are better at this than others, and cache efficiency may vary, but it's just a bad situation. Definitely worth some effort to break the directory up into subdirectories.
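A quick way to gauge how bloated the directory has become (a rough check; exact numbers vary by filesystem):

Code
ls -f ./temp | wc -l   # count entries without sorting (plain ls sorts, which is slow here)
ls -ld ./temp          # the directory file's own byte size grows with the entry count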

If the reason for the large number of files is that the directory keeps accumulating files (such as print files or other temp files), such that the vast majority of accesses either add a file or access a recently added file, then you can probably just create an archive subdirectory and move all files older than a certain number of days into it. That way the main working directory stays a reasonable size, yet you can still easily get at the older files by descending one or more levels.

As an example, I have sites that archive every print file and spreadsheet ever created. We just create an archival structure something like this...

/xxx/reports - current
/xxx/reports/2022 - files created in 2022
/xxx/reports/2021 - etc.

Depending on the number of files, we might go another level deep, down to the month. We then just run a crontab script periodically that scans the main directory and moves the applicable files down a level.
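A minimal sketch of such a crontab script, assuming a 90-day cutoff and the /xxx/reports layout above (both placeholders):

Code
#!/bin/sh
# archive-reports.sh: move files older than 90 days into a per-year subdirectory
cd /xxx/reports || exit 1
find . -maxdepth 1 -type f -mtime +90 | while IFS= read -r f; do
    year=$(date -r "$f" +%Y)   # year of the file's last modification (GNU date)
    mkdir -p "./$year"
    mv "$f" "./$year/"
done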

Another possible factor would be if the directory is shared -- that will also create a lot of additional overhead. Unless these files have to be shared across the network, it would be better to move the directory outside the scope of sharing.

Along those same lines, as mentioned previously, some file systems are better than others at handling a lot of files. If it's not practical to break the directory up into multiple directories, it may be worth creating a separate filesystem just for these files. Here's an article on Stack Exchange that contains some relevant benchmarks for different file systems that might be of interest.

Re: linux performance question [Re: Valli Information Systems] #36290 24 Jun 23 08:06 PM
Valli Information Systems (OP)
That was just today's example; I can also recreate the same problem by taking a large PDF file, running it through Ghostscript, and outputting another file. The question is why it only affects launching ashell: I can run multiple zip operations and they run just fine, and if I zip these files through my Samba shares, all is good.
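For concreteness, the kind of Ghostscript command I mean (hypothetical file names):

Code
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=out.pdf big.pdf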

Re: linux performance question [Re: Valli Information Systems] #36291 24 Jun 23 09:04 PM
Jack McGregor
Launching ashell may involve multiple operations depending on your startup command line. One way to narrow it down would be to add TRACE=EXEC to the miame.ini, which will trace each LIT command executed. Since each trace is time-stamped, that might at least reveal whether the long delay occurs before the very first ashell trace, or after the first LIT command, or ...?
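For reference, the directive is just a line in miame.ini, something like this (placement within the file, and the ini-style ; comment, are assumptions; check your local file's conventions):

Code
; in miame.ini: trace each LIT command ashell executes, with timestamps
TRACE=EXEC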

To go even deeper, you could use the Linux strace tool. If you don't have it, you can probably install it with yum install strace. With that, you can then generate a low-level trace of all system calls executed by ashell, via a command line such as:
Code
$ strace -tt -o ashell.trc ashell -n log dsk0:150,277:

The trace file (ashell.trc in this case) will look something like this...

13:48:41.613837 execve("/vm/repo/65/65core/bin/ashell", ["ashell", "-n", "log", "150,277"], [/* 55 vars */]) = 0
13:48:41.614090 brk(0) = 0x83df000
13:48:41.614142 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff76ef000
13:48:41.614185 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
13:48:41.614234 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
13:48:41.614262 fstat64(3, {st_mode=S_IFREG|0644, st_size=153441, ...}) = 0
13:48:41.614285 mmap2(NULL, 153441, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfffffffff76c9000
13:48:41.614303 close(3) = 0
13:48:41.614333 open("/lib/libncurses.so.5", O_RDONLY|O_CLOEXEC) = 3
13:48:41.614355 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0pO\0\0004\0\0\0"..., 512) = 512
...
13:48:41.648141 stat64("/vm/miame/dsk0/001004/log.lit", {st_mode=S_IFREG|0666, st_size=17370, ...}) = 0
13:48:41.648219 open("/vm/miame/dsk0/001004/log.lit", O_RDONLY|O_LARGEFILE) = 6
13:48:41.648242 stat64("/vm/miame/dsk0/001004/log.lit", {st_mode=S_IFREG|0666, st_size=17370, ...}) = 0
13:48:41.648263 _llseek(6, 0, [17370], SEEK_END) = 0
13:48:41.648278 _llseek(6, 0, [0], SEEK_SET) = 0
13:48:41.648293 time(NULL) = 1687639721
...

Maybe that will reveal something. In my example above, the entire sequence of launching and logging to a directory took all of 0.035 seconds, but there were no large directories to scan in the operation of loading and executing LOG.LIT.
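If nothing jumps out visually, one way to mine the trace for the stall is to re-run with -T, which makes strace annotate every call with the time spent in it, and then sort for the slowest ones:

Code
$ strace -tt -T -o ashell.trc ashell -n log dsk0:150,277:
$ sort -t'<' -k2 -rn ashell.trc | head   # the ten slowest system calls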

