Detail
When using shared memory (shmem), is the full amount requested by Aerospike allocated immediately?
For example, take the following namespace configuration:
namespace test {
    default-ttl 0
    index-stage-size 128M
    replication-factor 2
    sindex-stage-size 128M
    max-record-size 128K
    storage-engine memory {
        file /opt/aerospike/data/test1.dat
        file /opt/aerospike/data/test2.dat
        file /opt/aerospike/data/test3.dat
        file /opt/aerospike/data/test4.dat
        filesize 1G
    }
}
Listing the shared memory segments for this configuration shows the following:
root@72test-1:/# ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0xae001100 13 root 666 134217728 1
0xad001000 14 root 666 1073741824 12
0xad001001 15 root 666 1073741824 12
0xad001002 16 root 666 1073741824 12
0xad001003 17 root 666 1073741824 12
This looks correct: there is a 128MB index arena and 4 x 1GB arenas for storage. However...
root@72test-1:/# free -m
total used free shared buff/cache available
Mem: 31699 1215 19879 355 10604 29673
Swap: 8191 0 8191
The node shows only 355MB in the "shared" column, even though more than 4GB of segments exist.
Answer
This happens because Linux allocates shared memory lazily: a page in a segment is backed by physical memory only once it is first touched. Looking at the mappings of the asd process shows what is actually resident:
root@72test-1:/# pmap -x $(pgrep -f asd) | egrep "RSS|shmid"
Address Kbytes RSS Dirty Mode Mapping
00007fb5c1c00000 8192 8192 8192 rw-s- [ shmid=0x11 ]
00007fb5c2400000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c2c00000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c3400000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c3c00000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c4400000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c4c00000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c5400000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c5c00000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c6400000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c6c00000 8192 8192 8192 r--s- [ shmid=0x11 ]
00007fb5c7400000 958464 0 0 rw-s- [ shmid=0x11 ]
00007fb601c00000 8192 8192 8192 rw-s- [ shmid=0x10 ]
00007fb602400000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb602c00000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb603400000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb603c00000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb604400000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb604c00000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb605400000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb605c00000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb606400000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb606c00000 8192 8192 8192 r--s- [ shmid=0x10 ]
00007fb607400000 958464 0 0 rw-s- [ shmid=0x10 ]
00007fb641c00000 8192 8192 8192 rw-s- [ shmid=0xf ]
00007fb642400000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb642c00000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb643400000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb643c00000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb644400000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb644c00000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb645400000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb645c00000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb646400000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb646c00000 8192 8192 8192 r--s- [ shmid=0xf ]
00007fb647400000 958464 0 0 rw-s- [ shmid=0xf ]
00007fb681e00000 8192 8192 8192 rw-s- [ shmid=0xe ]
00007fb682600000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb682e00000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb683600000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb683e00000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb684600000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb684e00000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb685600000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb685e00000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb686600000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb686e00000 8192 8192 8192 r--s- [ shmid=0xe ]
00007fb687600000 958464 0 0 rw-s- [ shmid=0xe ]
00007fb6c1e00000 131072 4 4 rw-s- [ shmid=0xd ]
Taking shared memory segment 15 (0xf) as an example, there are 11 x 8MB mappings (88MB) that are fully resident. The final mapping of 958464KB shows an RSS of 0, so it does not count towards the "shared" column of free.
Adding up 88MB x 4 storage files (352MB) plus 4KB from the index arena comes to just 3MB short of the 355MB listed in the free output.
Note also that 958464KB plus the 88MB (90112KB) equals 1048576KB, which is the 1073741824 bytes shown by ipcs.
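The arithmetic above can be checked with a small script. This is a minimal sketch, assuming `pmap -x` rows shaped like the output shown earlier; the function name `summarize_pmap` and the `SAMPLE` text are illustrative and not part of any Aerospike tool.

```python
import re
from collections import defaultdict

# Matches `pmap -x` rows for System V shared memory mappings, e.g.
# "00007fb5c1c00000 8192 8192 8192 rw-s- [ shmid=0x11 ]"
ROW = re.compile(
    r"^[0-9a-f]+\s+(\d+)\s+(\d+)\s+(\d+)\s+\S+\s+\[\s*shmid=(0x[0-9a-f]+)\s*\]"
)

def summarize_pmap(text):
    """Return {shmid: (mapped_kb, rss_kb)} summed over all shmid rows."""
    totals = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            kbytes, rss, _dirty, shmid = m.groups()
            entry = totals[int(shmid, 16)]
            entry[0] += int(kbytes)
            entry[1] += int(rss)
    return {shmid: tuple(v) for shmid, v in totals.items()}

# Sample shaped like the output above: storage segment 0xf and index arena 0xd.
SAMPLE = "\n".join(
    ["Address Kbytes RSS Dirty Mode Mapping"]
    + ["00007fb641c00000 8192 8192 8192 rw-s- [ shmid=0xf ]"] * 11
    + [
        "00007fb647400000 958464 0 0 rw-s- [ shmid=0xf ]",
        "00007fb6c1e00000 131072 4 4 rw-s- [ shmid=0xd ]",
    ]
)

for shmid, (mapped_kb, rss_kb) in sorted(summarize_pmap(SAMPLE).items()):
    print(f"shmid {shmid}: mapped {mapped_kb} KB, resident {rss_kb} KB")
```

On a live node the same function could be fed the real output of the pmap command instead of `SAMPLE`.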
*The pmap command above shows how the Aerospike process has mapped its shared memory, whereas the free command reports how much shared memory is actually allocated.
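The lazy behaviour itself can be reproduced outside Aerospike. The following is a rough, Linux-only sketch: it uses an anonymous shared mapping from Python's `mmap` module as a stand-in for a System V segment (an assumption made for simplicity, not what Aerospike does) and reads the resident set size from `/proc/self/statm`. The function names are hypothetical.

```python
import mmap

PAGE = mmap.PAGESIZE

def resident_bytes():
    """Resident set size of this process, from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1]) * PAGE

def lazy_allocation_demo(size=32 * 1024 * 1024):
    """Map `size` bytes of shared memory, then touch the first half of it.

    Returns (growth_after_map, growth_after_touch) in bytes.
    """
    before = resident_bytes()
    region = mmap.mmap(-1, size)      # anonymous shared mapping, nothing touched yet
    after_map = resident_bytes()
    for offset in range(0, size // 2, PAGE):
        region[offset] = 1            # first write to a page makes the kernel back it
    after_touch = resident_bytes()
    region.close()
    return after_map - before, after_touch - before

map_growth, touch_growth = lazy_allocation_demo()
print(f"after mapping:  +{map_growth // 1024} KB resident")
print(f"after touching: +{touch_growth // 1024} KB resident")
```

Creating the mapping should barely move the resident size, while touching half of the pages should raise it by roughly half of the mapping size. Exact figures vary with the kernel's accounting.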
On top of this, consider what happens if you use asmt to back up and then restore shared memory. Before the backup, the node looks like this:
total used free shared buff/cache available
Mem: 15226072 403252 14324344 17380 498476 14556600
Swap: 0 0 0
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0xae001100 2 root 666 1073741824 1
0xa2001100 3 root 666 1073741824 1
Address Kbytes RSS Dirty Mode Mapping
00007f2f91200000 1048576 16384 16384 rw-s- [ shmid=0x3 ]
00007f2fd1200000 1048576 64 64 rw-s- [ shmid=0x2 ]
We can see that in this case there is ~17MB of allocated shared memory, spread across the primary index and secondary index arenas, both of which are 1GB arenas.
Here is our backup:
total 2104360
drwxr-xr-x. 2 root root 106 Jun 26 08:43 .
drwx------. 1 root root 4096 Jun 26 08:46 ..
-rw-rw-rw-. 1 root root 16720 Jun 26 08:43 a2001000.dat
-rw-rw-rw-. 1 root root 1073741824 Jun 26 08:43 a2001100.dat
-rw-rw-rw-. 1 root root 2097152 Jun 26 08:43 ae001000.dat
-rw-rw-rw-. 1 root root 5259264 Jun 26 08:43 ae001001.dat
-rw-rw-rw-. 1 root root 1073741824 Jun 26 08:43 ae001100.dat
Note that although only 17MB of shared memory was allocated, asmt has backed up the full 2GB of shared memory. The files are padded with zeros:
0175020 0230 adef 56ac afd0 0000 0000 0000 0000
0175040 0000 1001 0000 0000 5f93 c42e 0471 8300
0175060 0813 a000 0000 0000 0000 0000 0000 0000
0175100 0000 0000 0000 0000 0000 0000 0000 0000
*
10000000000
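The amount of padding in a backup file can be estimated by counting how many pages contain anything other than zeros. This is a minimal sketch, not an asmt feature: `nonzero_pages` and `make_padded_file` are hypothetical helpers, the 4096-byte page size is an assumption, and the temporary file only stands in for a real backup file.

```python
import os
import tempfile

PAGE = 4096  # assumed page size for the estimate

def nonzero_pages(path, page=PAGE):
    """Return (pages_with_data, total_pages) for the file at `path`."""
    zero = bytes(page)
    with_data = total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(page)
            if not chunk:
                break
            total += 1
            if chunk != zero[:len(chunk)]:
                with_data += 1
    return with_data, total

def make_padded_file(data_pages, total_pages, page=PAGE):
    """Write a stand-in for a backup file: data first, then zero padding."""
    fd, path = tempfile.mkstemp(suffix=".dat")
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x01" * (data_pages * page))
        f.write(bytes((total_pages - data_pages) * page))
    return path

path = make_padded_file(data_pages=4, total_pages=256)
used, total = nonzero_pages(path)
os.remove(path)
print(f"{used} of {total} pages contain data")
```

Pointing `nonzero_pages` at a real backup file would give a rough idea of how much of it is data and how much is padding.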
If we then restore those files, the node occupies the whole 2GB of shared memory, because the restore writes all of those zeros back into the segments.
total used free shared buff/cache available
Mem: 15226072 323340 10200664 2105284 4702068 12535324
Swap: 0 0 0
Notes
There is a danger here: shared memory can be overcommitted, as in this example:
root@72test-1:/# ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0xad001000 14 root 666 1073741824 12
0xad001001 15 root 666 1073741824 12
0xad001002 16 root 666 1073741824 12
0xad001003 17 root 666 1073741824 12
0xae001100 30 root 666 134217728 1
0xae002100 31 root 666 134217728 1
0xad002000 32 root 666 32212254720 1
root@72test-1:/# free -h
total used free shared buff/cache available
Mem: 30Gi 1.4Gi 18Gi 363Mi 10Gi 28Gi
Swap: 8.0Gi 0B 8.0Gi
This server has 30GB of RAM. I have 4 x 1GB segments and a 30GB shared memory segment along with the index arenas (34GB+ in total), yet I am only using 363MB of RAM. If I filled up the space I have allocated, Aerospike would crash with an OOM.
Equally, you may have many namespaces that each have 1GB arenas for primary and secondary indexes. Aerospike will let you start up because, although the memory has been requested and is notionally allocated, the overcommitment is only realised when you try to use it.
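A simple guard against this is to compare the total size of all segments with the physical memory on the node. The sketch below assumes `ipcs -m` output shaped like the listing above and takes the memory total from the earlier `free -m` output on the same host. The helper name `parse_ipcs_bytes` is illustrative.

```python
def parse_ipcs_bytes(text):
    """Sum the `bytes` column of `ipcs -m` rows (rows whose key starts with 0x)."""
    total = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0].startswith("0x"):
            total += int(fields[4])
    return total

IPCS_SAMPLE = """\
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0xad001000 14 root 666 1073741824 12
0xad001001 15 root 666 1073741824 12
0xad001002 16 root 666 1073741824 12
0xad001003 17 root 666 1073741824 12
0xae001100 30 root 666 134217728 1
0xae002100 31 root 666 134217728 1
0xad002000 32 root 666 32212254720 1
"""

MEM_TOTAL_MIB = 31699  # "total" column of the free -m output shown earlier

requested_mib = parse_ipcs_bytes(IPCS_SAMPLE) // (1024 * 1024)
overcommit_mib = requested_mib - MEM_TOTAL_MIB

print(f"requested: {requested_mib} MiB, physical: {MEM_TOTAL_MIB} MiB")
if overcommit_mib > 0:
    print(f"segments exceed physical memory by {overcommit_mib} MiB")
```

A positive difference does not mean the node will fail immediately. It means the segments cannot all be filled without exhausting physical memory.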