ORDMAP KEY SIZE
#35247
18 May 22 05:41 PM
|
Joined: Jun 2001
Posts: 430
Valli Information Systems
OP
Member
Is the limit 511 characters for the 'key' of the following?
DIMX $TEST,ORDMAP(VARSTR;VARSTR)
MAP1 S,S,511,"TESTING"
S = S+SPACE(512)
$TEST(S) = "YES"
? $TEST(S)
END
RUN X
"YES"
Change the length of S to S,512:
RUN X
<null>
thanks
|
|
|
Re: ORDMAP KEY SIZE
[Re: Valli Information Systems]
#35248
18 May 22 05:53 PM
|
Joined: Jun 2001
Posts: 11,925
Jack McGregor
Member
Indeed it is. It seemed reasonable to use a fixed-size static buffer, assuming that keys would be of "reasonable size". The limit could easily be increased, but eliminating it entirely would require a bit more overhead. That is probably worth it if you have some exotic use for extremely large keys. But before going down that path I'd be interested to know a bit more about the application.
|
|
|
Re: ORDMAP KEY SIZE
[Re: Valli Information Systems]
#35249
18 May 22 06:00 PM
|
Joined: Jun 2001
Posts: 430
Valli Information Systems
OP
Member
In this case I was running through a sequential file looking for duplicate entries, with line lengths up to 4096. I would really like the limit increased to that if possible, as I use these maps extensively and it never occurred to me that there was a limit.
thanks
|
|
|
Re: ORDMAP KEY SIZE
[Re: Valli Information Systems]
#35252
18 May 22 10:26 PM
|
Joined: Jun 2001
Posts: 11,925
Jack McGregor
Member
Ok, I think the limit has been removed entirely in 6.5.1716.2. Here are -el7 versions ...
ash-6.5.1716.2-el7-upd.tz
ash-6.5.1716.2-el7-efs-upd.tz
(If you're still on -el6, it may take another day or two. My CentOS 6 virtual machine appears to have some kind of network configuration breakdown which I need to debug first. Might be a good incentive to move forward since its EOL was back in 2020.)
|
|
|
Re: ORDMAP KEY SIZE
[Re: Valli Information Systems]
#35254
19 May 22 05:18 AM
|
Joined: Nov 2006
Posts: 2,262
Stephen Funkhouser
Member
You could hash the value and use the hash as the ORDMAP key. Duplicate lines produce identical hashes, so they would collide in the map just as the original values would. If you use a combination of DJB+ELF with XCALL HASH, the key would be 16 bytes. If you need a stronger hash to reduce false positive matches, you could use MD5, which is still only 32 bytes.
Stephen Funkhouser Diversified Data Solutions
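To illustrate the hashing suggestion above, here is a minimal sketch in Python rather than A-Shell BASIC (chosen only so that it can be run anywhere; it does not use XCALL HASH or ORDMAP). The names `line_key` and `candidate_duplicates` are hypothetical, and MD5 via `hashlib` stands in for whichever hash mode you would pick. It shows that a line of any length becomes a 32-character key, and that repeated lines land on the same key.

```python
import hashlib

def line_key(line: str) -> str:
    """Return a fixed-size key (32 hex characters) for a line of any length."""
    return hashlib.md5(line.encode("utf-8")).hexdigest()

def candidate_duplicates(lines):
    """Return (first_index, later_index) pairs for lines whose hash was already seen.

    These are only candidates: two different lines could in principle
    share a hash, so a match is not proof of a duplicate.
    """
    seen = {}            # hash key -> index of the first line with that hash
    candidates = []
    for i, line in enumerate(lines):
        key = line_key(line)
        if key in seen:
            candidates.append((seen[key], i))
        else:
            seen[key] = i
    return candidates

# A 4096-character line still yields a 32-character key.
print(len(line_key("x" * 4096)))
print(candidate_duplicates(["a", "b", "a", "c", "b"]))
```

As a rough guide to the false positive risk, the usual birthday approximation puts the chance of any collision among n distinct lines with a b-bit hash at about n^2 / 2^(b+1), so a wider hash buys a lot of headroom.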
|
|
|
Re: ORDMAP KEY SIZE
[Re: Valli Information Systems]
#35255
19 May 22 08:30 AM
|
Joined: Jun 2001
Posts: 11,925
Jack McGregor
Member
XCALL HASH is an excellent idea for detecting duplicate lines in a file. Other than having to worry about the false positive possibility, it requires only one extra line of code in your loop (for the XCALL). And although the likelihood of a false positive increases more or less linearly with the number of lines in your file, so would the memory overhead of using the entire line as the map key. I suspect that by the time false positives became a problem, you might have first run out of memory.
As for how to check whether a duplicate hash is a false positive or not, you could of course store the line as the value part of the key-value pair, but again, if your file is huge, that would contradict the goal (however old-fashioned it may be!) of not being excessively wasteful of memory.
Another idea would be to use MX_FILEPOS to get the position of each line and use that as the value of each key-value pair. Then when you find a duplicate key, you can seek back to the position of the earlier line to re-read it and compare to the current one.
do while eof(ch) # 1
    xcall MIAMEX, MX_FILEPOS, ch, MXOP_GET, curpos, status       ! get current file position (start of this line)
    input line #ch, pline$
    xcall HASH, 1, pline$, len(pline$), hash$, status             ! hash the line to get a short key
    if not .isnull($map(hash$)) then                              ! if key already exists...
        oldpos = $map(hash$)                                      ! retrieve position of the earlier line with this hash
        xcall MIAMEX, MX_FILEPOS, ch, MXOP_SET, oldpos, status    ! seek back to earlier line
        input line #ch, oldpline$                                 ! read the potentially duplicate line
        xcall MIAMEX, MX_FILEPOS, ch, MXOP_SET, curpos, status    ! restore start of current line
        input line #ch, dummy$                                    ! and jump to end of it (our actual current pos)
        if oldpline$ = pline$ then                                ! see if it really is a duplicate
            ? "Duplicate lines at position ";oldpos;" and ";curpos
            repeat                                                ! next iteration; map keeps the earlier position
        endif
    endif
    $map(hash$) = curpos                                          ! new key (or false positive): store hash -> position
loop
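For readers who want to experiment with the seek-back idea outside A-Shell, the following is a rough Python sketch of the same technique, not a translation of the loop above. It assumes a seekable binary file, uses `tell()` and `seek()` in place of MX_FILEPOS, and uses MD5 from `hashlib` in place of XCALL HASH; the function name `find_duplicate_lines` is hypothetical. One deliberate difference: it keeps a list of positions per hash, so a false positive does not overwrite the earlier position.

```python
import hashlib
import io

def find_duplicate_lines(f):
    """Return (old_pos, cur_pos) pairs for lines that exactly repeat an earlier line.

    f must be a seekable binary file object. Only hashes and file offsets
    are kept in memory; line text is re-read from the file when needed.
    """
    positions = {}       # hash digest -> start offsets of distinct lines with that hash
    results = []
    while True:
        cur_pos = f.tell()               # start of the current line
        raw = f.readline()
        if not raw:                      # empty bytes means end of file
            break
        next_pos = f.tell()              # where reading must resume afterwards
        line = raw.rstrip(b"\r\n")
        key = hashlib.md5(line).digest()
        match = None
        for old_pos in positions.get(key, []):
            f.seek(old_pos)              # seek back to the earlier line
            if f.readline().rstrip(b"\r\n") == line:
                match = old_pos          # confirmed: a real duplicate
                break
        f.seek(next_pos)                 # restore our actual current position
        if match is not None:
            results.append((match, cur_pos))
        else:
            positions.setdefault(key, []).append(cur_pos)
    return results

sample = io.BytesIO(b"alpha\nbeta\nalpha\ngamma\nbeta\n")
print(find_duplicate_lines(sample))      # [(0, 11), (6, 23)]
```

The extra reads only happen when a hash has been seen before, so for a file with few duplicates the cost stays close to a single sequential pass.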
In any case, I'll see if I can fix my CentOS 6 system today.