ORDMAP KEY SIZE #35247 18 May 22 05:41 PM
Valli Information Systems (OP)
Member
Joined: Jun 2001
Posts: 430
Is the limit 511 characters for the 'key' of the following?

DIMX $TEST,ORDMAP(VARSTR;VARSTR)
MAP1 S,S,511,"TESTING"
S = S+SPACE(512)
$TEST(S) = "YES"
? $TEST(S)
END

RUN X
"YES"

Change the length of S to S,512:

RUN X

<null>

thanks

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35248 18 May 22 05:53 PM
Jack McGregor
Member
Joined: Jun 2001
Posts: 11,925
Indeed it is. It seemed reasonable to use a fixed-size static buffer, assuming that keys would be of "reasonable size". The limit could easily be increased, but eliminating it entirely would require a bit more overhead. That is probably worth it if you have some exotic use for extremely large keys. But before going down that path I'd be interested to know a bit more about the application.

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35249 18 May 22 06:00 PM
Valli Information Systems (OP)
Member
Joined: Jun 2001
Posts: 430
In this case I was reading through a sequential file looking for duplicate entries, with line lengths up to 4096. I would really like the limit increased to that if possible, as I use these extensively and it never occurred to me that there was a limit.

thanks

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35250 18 May 22 06:02 PM
Jack McGregor
Member
Joined: Jun 2001
Posts: 11,925
Ok, let me see what I can do. Hopefully you have a lot of RAM to spare!

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35251 18 May 22 06:05 PM
Valli Information Systems (OP)
Member
Joined: Jun 2001
Posts: 430
I think back to the alpha micro days and having 512kb was a dream.

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35252 18 May 22 10:26 PM
Jack McGregor
Member
Joined: Jun 2001
Posts: 11,925
Ok, I think the limit has been removed entirely in 6.5.1716.2.

Here are -el7 versions ...

ash-6.5.1716.2-el7-upd.tz
ash-6.5.1716.2-el7-efs-upd.tz

(If you're still on -el6, it may take another day or two. My CentOS 6 virtual machine appears to have some kind of network configuration breakdown which I need to debug first. Might be a good incentive to move forward since its EOL was back in 2020.)

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35253 19 May 22 05:05 AM
Valli Information Systems (OP)
Member
Joined: Jun 2001
Posts: 430
thanks, unfortunately i still am on el6, another project to look forward to

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35254 19 May 22 05:18 AM
Stephen Funkhouser
Member
Joined: Nov 2006
Posts: 2,262
You could hash the value and use the hash as the ordmap key. Duplicate lines produce identical hashes, so they would collide in the map just as the original values would. If you use a combination of DJB+ELF with XCALL HASH the key would be 16 bytes. If you need a stronger hash to reduce false positive matches, you could use MD5, which is still only 32 bytes.


Stephen Funkhouser
Diversified Data Solutions
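
The hashing suggestion above can be sketched in a language-neutral way. The following is a minimal Python illustration, not A-Shell code: it uses Python's standard `hashlib` in place of XCALL HASH, and the names `key_for` and `find_duplicate_candidates` are hypothetical. It only shows that a 4096-character line reduces to a 32-character MD5 hex key, so the map key stays small no matter how long the line is.

```python
import hashlib

def key_for(line: str) -> str:
    """Return a fixed-length (32 hex characters) MD5 digest to use as the map key."""
    return hashlib.md5(line.encode("utf-8")).hexdigest()

def find_duplicate_candidates(lines):
    """Yield (first_index, later_index) for lines whose digests match."""
    first_seen = {}  # digest -> index of the first line with that digest
    for i, line in enumerate(lines):
        k = key_for(line)
        if k in first_seen:
            yield first_seen[k], i
        else:
            first_seen[k] = i

long_line = "x" * 4096
sample = [long_line, "short", long_line + "y", long_line]
duplicates = list(find_duplicate_candidates(sample))
key_length = len(key_for(long_line))
```

Matching digests are only candidates: two different lines can in principle share a digest, so a careful version would still compare the lines themselves before reporting a duplicate.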
Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35255 19 May 22 08:30 AM
Jack McGregor
Member
Joined: Jun 2001
Posts: 11,925
XCALL HASH is an excellent idea for detecting duplicate lines in a file. Other than having to worry about the false positive possibility, it requires only one extra line of code in your loop (for the XCALL). And although the likelihood of a false positive increases more or less linearly with the number of lines in your file, so would the memory overhead of using the entire line as the map key. I suspect that by the time false positives became a problem, you might have first run out of memory.
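
To put rough numbers on the false positive risk, the standard birthday approximation can be used. The Python sketch below is only an estimate under the assumption that hash values are uniformly distributed; the function name `collision_probability` and the bit widths are illustrative and say nothing about which algorithms XCALL HASH uses internally (MD5, at 128 bits, is the only width taken from this thread). The chance that one new line collides with an earlier one grows roughly linearly with the number of lines already stored, while the chance of at least one collision anywhere in the file grows roughly with the square of the line count for as long as it remains small.

```python
import math

def collision_probability(n_lines: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n distinct lines share a hash value."""
    pairs = n_lines * (n_lines - 1) / 2          # number of distinct line pairs
    # 1 - exp(-pairs / 2**bits), computed with expm1 to stay accurate for tiny values
    return -math.expm1(-pairs / 2.0 ** hash_bits)

for bits in (32, 64, 128):
    print(bits, collision_probability(1_000_000, bits))
```

With a million distinct lines, a 32-bit hash is almost certain to produce at least one false positive, while a 128-bit hash such as MD5 makes it vanishingly unlikely for non-adversarial data.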

As for how to check whether a duplicate hash is a false positive or not, you could of course store the line as the value part of the key-value pair, but again, if your file is huge, that would contradict the goal (however old-fashioned it may be!) of not being excessively wasteful of memory.

Another idea would be to use MX_FILEPOS to get the position of each line and use that as the value of each key-value pair. Then when you find a duplicate key, you can seek back to the position of the earlier line to re-read it and compare to the current one.
Code

do while eof(ch) # 1
    xcall MIAMEX, MX_FILEPOS, ch, MXOP_GET, curpos, status       ! get current file position
    input line #ch, pline$
    xcall HASH, 1, pline$, len(pline$), hash$, status
    if not .isnull($map(hash$)) then                             ! if key already exists...
        oldpos = $map(hash$)                                     ! get position of the earlier line with this hash
        xcall MIAMEX, MX_FILEPOS, ch, MXOP_SET, oldpos, status   ! seek back to earlier line
        input line #ch, oldpline$                                ! read the potentially duplicate line
        xcall MIAMEX, MX_FILEPOS, ch, MXOP_SET, curpos, status   ! restore start of current line
        input line #ch, dummy$                                   !  and jump to end of it (our actual current pos)
        if oldpline$ = pline$ then                               ! see if it really is a duplicate
            ? "Duplicate lines at position ";oldpos;" and ";curpos
            repeat                                               ! true duplicate: skip the store below
        endif
    endif
    $map(hash$) = curpos         ! no true duplicate: store hash -> position of this line
loop
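
For readers who want to try the same idea outside A-Shell, here is a minimal Python sketch of the approach above, offered as an illustration rather than a translation. It assumes `hashlib.md5` as a stand-in for XCALL HASH and `tell()`/`seek()` as stand-ins for MX_FILEPOS; the function name `find_duplicate_lines` is hypothetical. Unlike the loop above, it records the position after the current line with `tell()` instead of re-reading the line to get back there.

```python
import hashlib
import io

def find_duplicate_lines(f):
    """Return (earlier_offset, later_offset) pairs for identical lines in binary file f."""
    seen = {}        # digest -> offset of the latest non-duplicate line with that digest
    duplicates = []
    while True:
        curpos = f.tell()                    # offset of the line about to be read
        raw = f.readline()
        if not raw:                          # empty bytes means end of file
            break
        nextpos = f.tell()                   # where reading resumes after any seek
        line = raw.rstrip(b"\r\n")
        digest = hashlib.md5(line).digest()
        oldpos = seen.get(digest)
        if oldpos is not None:
            f.seek(oldpos)                   # re-read the earlier line to rule out a false positive
            earlier = f.readline().rstrip(b"\r\n")
            f.seek(nextpos)                  # restore the read position
            if earlier == line:
                duplicates.append((oldpos, curpos))
                continue                     # keep the first occurrence in the map
        seen[digest] = curpos
    return duplicates

data = io.BytesIO(b"alpha\nbeta\nalpha\ngamma\nbeta\n")
result = find_duplicate_lines(data)
```

Only the digest and an integer offset are held in memory per line, which is the point of the technique: the file itself serves as the store for verifying candidate matches.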

In any case, I'll see if I can fix my CentOS 6 system today.

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35256 19 May 22 05:50 PM
Jack McGregor
Member
Joined: Jun 2001
Posts: 11,925
Ok, here are -el6 updates supporting unlimited ordered-map key length...

ash-6.5.1716.3-el6-upd.tz
ash-6.5.1716.3-el6-efs-upd.tz
ash65notes.txt

Warning: this update requires a corresponding update to both of the ASQL connectors. (I don't think you are using them, but anyone who is should wait for those to be posted below in the near future.)

Re: ORDMAP KEY SIZE [Re: Valli Information Systems] #35260 20 May 22 08:42 AM
Jack McGregor
Member
Joined: Jun 2001
Posts: 11,925
For what it's worth, here's an updated -el6 version of the ASQL connector for MySQL 8 that works with 6.5.1716.3 ...

libashmysql8.so.1.5.146.el6.tz

It can also be found in the ASQL download directory


Moderated by  Jack McGregor, Ty Griffin 