summaryrefslogtreecommitdiffstats
path: root/security/selinux/ss (follow)
Commit message (Collapse)AuthorAgeFilesLines
* selinux: conditionally reschedule in hashtab_insert while loading selinux policyDave Jones2014-06-031-0/+3
| | | | | | | | | After silencing the sleeping warning in mls_convert_context() I started seeing similar traces from hashtab_insert. Do a cond_resched there too. Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
* selinux: conditionally reschedule in mls_convert_context while loading ↵Dave Jones2014-06-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | selinux policy On a slow machine (with debugging enabled), upgrading selinux policy may take a considerable amount of time. Long enough that the softlockup detector gets triggered. The backtrace looks like this.. > BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045] > Call Trace: > [<ffffffff81221ddf>] symcmp+0xf/0x20 > [<ffffffff81221c27>] hashtab_search+0x47/0x80 > [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0 > [<ffffffff812294e8>] convert_context+0x378/0x460 > [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240 > [<ffffffff812221b5>] sidtab_map+0x45/0x80 > [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50 > [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe > [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 > [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0 > [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe > [<ffffffff8121e947>] sel_write_load+0xa7/0x770 > [<ffffffff81139633>] ? vfs_write+0x1c3/0x200 > [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0 > [<ffffffff8113952b>] vfs_write+0xbb/0x200 > [<ffffffff811581c7>] ? fget_light+0x397/0x4b0 > [<ffffffff81139c27>] SyS_write+0x47/0xa0 > [<ffffffff8153bde4>] tracesys+0xdd/0xe2 Stephen Smalley suggested: > Maybe put a cond_resched() within the ebitmap_for_each_positive_bit() > loop in mls_convert_context()? That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted. Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
* selinux: add gfp argument to security_xfrm_policy_alloc and fix callersNikolay Aleksandrov2014-03-101-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | security_xfrm_policy_alloc can be called in atomic context so the allocation should be done with GFP_ATOMIC. Add an argument to let the callers choose the appropriate way. In order to do so a gfp argument needs to be added to the method xfrm_policy_alloc_security in struct security_operations and to the internal function selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic callers and leave GFP_KERNEL as before for the rest. The path that needed the gfp argument addition is: security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security -> all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) -> selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only) Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also add it to security_context_to_sid which is used inside and prior to this patch did only GFP_KERNEL allocation. So add gfp argument to security_context_to_sid and adjust all of its callers as well. CC: Paul Moore <paul@paul-moore.com> CC: Dave Jones <davej@redhat.com> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Fan Du <fan.du@windriver.com> CC: David S. Miller <davem@davemloft.net> CC: LSM list <linux-security-module@vger.kernel.org> CC: SELinux list <selinux@tycho.nsa.gov> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* Merge branch 'stable-3.14' of git://git.infradead.org/users/pcmoore/selinux ↵James Morris2014-02-241-4/+4
|\ | | | | | | into for-linus
| * SELinux: bigendian problems with filename trans rulesEric Paris2014-02-201-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When writing policy via /sys/fs/selinux/policy I wrote the type and class of filename trans rules in CPU endian instead of little endian. On x86_64 this works just fine, but it means that on big endian arch's like ppc64 and s390 userspace reads the policy and converts it from le32_to_cpu. So the values are all screwed up. Write the values in le format like it should have been to start. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com>
* | Merge branch 'stable-3.14' of git://git.infradead.org/users/pcmoore/selinux ↵James Morris2014-02-101-0/+4
|\| | | | | | | into for-linus
| * SELinux: Fix kernel BUG on empty security contexts.Stephen Smalley2014-02-051-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec #1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a [ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b [ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179 [ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4 [ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31 [ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22 [ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d [ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91 [ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b [ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30 [ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP <ffff8805c0ac3c38> [ 475.328734] ---[ end trace f076482e9d754adc ]--- Reported-by: Matthew Thode <mthode@mthode.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com>
* | Merge git://git.infradead.org/users/eparis/auditLinus Torvalds2014-01-241-8/+4
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull audit update from Eric Paris: "Again we stayed pretty well contained inside the audit system. Venturing out was fixing a couple of function prototypes which were inconsistent (didn't hurt anything, but we used the same value as an int, uint, u32, and I think even a long in a couple of places). We also made a couple of minor changes to when a couple of LSMs called the audit system. We hoped to add aarch64 audit support this go round, but it wasn't ready. I'm disappearing on vacation on Thursday. I should have internet access, but it'll be spotty. If anything goes wrong please be sure to cc rgb@redhat.com. He'll make fixing things his top priority" * git://git.infradead.org/users/eparis/audit: (50 commits) audit: whitespace fix in kernel-parameters.txt audit: fix location of __net_initdata for audit_net_ops audit: remove pr_info for every network namespace audit: Modify a set of system calls in audit class definitions audit: Convert int limit uses to u32 audit: Use more current logging style audit: Use hex_byte_pack_upper audit: correct a type mismatch in audit_syscall_exit() audit: reorder AUDIT_TTY_SET arguments audit: rework AUDIT_TTY_SET to only grab spin_lock once audit: remove needless switch in AUDIT_SET audit: use define's for audit version audit: documentation of audit= kernel parameter audit: wait_for_auditd rework for readability audit: update MAINTAINERS audit: log task info on feature change audit: fix incorrect set of audit_sock audit: print error message when fail to create audit socket audit: fix dangling keywords in audit_log_set_loginuid() output audit: log on errors from filter user rules ...
| * selinux: call WARN_ONCE() instead of calling audit_log_start()Richard Guy Briggs2014-01-141-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | Two of the conditions in selinux_audit_rule_match() should never happen and the third indicates a race that should be retried. Remove the calls to audit_log() (which call audit_log_start()) and deal with the errors in the caller, logging only once if the condition is met. Calling audit_log_start() in this location makes buffer allocation and locking more complicated in the calling tree (audit_filter_user()). Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* | SELinux: Fix memory leak upon loading policyTetsuo Handa2014-01-071-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hello. I got below leak with linux-3.10.0-54.0.1.el7.x86_64 . [ 681.903890] kmemleak: 5538 new suspected memory leaks (see /sys/kernel/debug/kmemleak) Below is a patch, but I don't know whether we need special handing for undoing ebitmap_set_bit() call. ---------- >>From fe97527a90fe95e2239dfbaa7558f0ed559c0992 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Mon, 6 Jan 2014 16:30:21 +0900 Subject: [PATCH] SELinux: Fix memory leak upon loading policy Commit 2463c26d "SELinux: put name based create rules in a hashtable" did not check return value from hashtab_insert() in filename_trans_read(). It leaks memory if hashtab_insert() returns error. unreferenced object 0xffff88005c9160d0 (size 8): comm "systemd", pid 1, jiffies 4294688674 (age 235.265s) hex dump (first 8 bytes): 57 0b 00 00 6b 6b 6b a5 W...kkk. backtrace: [<ffffffff816604ae>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811cba5e>] kmem_cache_alloc_trace+0x12e/0x360 [<ffffffff812aec5d>] policydb_read+0xd1d/0xf70 [<ffffffff812b345c>] security_load_policy+0x6c/0x500 [<ffffffff812a623c>] sel_write_load+0xac/0x750 [<ffffffff811eb680>] vfs_write+0xc0/0x1f0 [<ffffffff811ec08c>] SyS_write+0x4c/0xa0 [<ffffffff81690419>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff However, we should not return EEXIST error to the caller, or the systemd will show below message and the boot sequence freezes. systemd[1]: Failed to load SELinux policy. Freezing. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Eric Paris <eparis@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com>
* | selinux: revert 102aefdda4d8275ce7d7100bc16c88c74272b260Paul Moore2013-12-131-38/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert "selinux: consider filesystem subtype in policies" This reverts commit 102aefdda4d8275ce7d7100bc16c88c74272b260. Explanation from Eric Paris: SELinux policy can specify if it should use a filesystem's xattrs or not. In current policy we have a specification that fuse should not use xattrs but fuse.glusterfs should use xattrs. This patch has a bug in which non-glusterfs filesystems would match the rule saying fuse.glusterfs should use xattrs. If both fuse and the particular filesystem in question are not written to handle xattr calls during the mount command, they will deadlock. I have fixed the bug to do proper matching, however I believe a revert is still the correct solution. The reason I believe that is because the code still does not work. The s_subtype is not set until after the SELinux hook which attempts to match on the ".gluster" portion of the rule. So we cannot match on the rule in question. The code is useless. Signed-off-by: Paul Moore <pmoore@redhat.com>
* | SELinux: security_load_policy: Silence frame-larger-than warningTim Gardner2013-11-191-22/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dynamically allocate a couple of the larger stack variables in order to reduce the stack footprint below 1024. gcc-4.8 security/selinux/ss/services.c: In function 'security_load_policy': security/selinux/ss/services.c:1964:1: warning: the frame size of 1104 bytes is larger than 1024 bytes [-Wframe-larger-than=] } Also silence a couple of checkpatch warnings at the same time. WARNING: sizeof policydb should be sizeof(policydb) + memcpy(oldpolicydb, &policydb, sizeof policydb); WARNING: sizeof policydb should be sizeof(policydb) + memcpy(&policydb, newpolicydb, sizeof policydb); Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <james.l.morris@oracle.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
* | SELinux: Update policy version to support constraints infoRichard Haines2013-11-193-9/+99
| | | | | | | | | | | | | | | | | | Update the policy version (POLICYDB_VERSION_CONSTRAINT_NAMES) to allow holding of policy source info for constraints. Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
* | Merge git://git.infradead.org/users/eparis/selinuxPaul Moore2013-09-186-38/+85
|\ \ | |/ |/| | | | | | | | | | | | | | | | | Conflicts: security/selinux/hooks.c Pull Eric's existing SELinux tree as there are a number of patches in there that are not yet upstream. There was some minor fixup needed to resolve a conflict in security/selinux/hooks.c:selinux_set_mnt_opts() between the labeled NFS patches and Eric's security_fs_use() simplification patch.
| * selinux: consider filesystem subtype in policiesAnand Avati2013-08-281-4/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not considering sub filesystem has the following limitation. Support for SELinux in FUSE is dependent on the particular userspace filesystem, which is identified by the subtype. For e.g, GlusterFS, a FUSE based filesystem supports SELinux (by mounting and processing FUSE requests in different threads, avoiding the mount time deadlock), whereas other FUSE based filesystems (identified by a different subtype) have the mount time deadlock. By considering the subtype of the filesytem in the SELinux policies, allows us to specify a filesystem subtype, in the following way: fs_use_xattr fuse.glusterfs gen_context(system_u:object_r:fs_t,s0); This way not all FUSE filesystems are put in the same bucket and subjected to the limitations of the other subtypes. Signed-off-by: Anand Avati <avati@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * Add SELinux policy capability for always checking packet and peer classes.Chris PeBenito2013-07-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the packet class in SELinux is not checked if there are no SECMARK rules in the security or mangle netfilter tables. Some systems prefer that packets are always checked, for example, to protect the system should the netfilter rules fail to load or if the nefilter rules were maliciously flushed. Add the always_check_network policy capability which, when enabled, treats SECMARK as enabled, even if there are no netfilter SECMARK rules and treats peer labeling as enabled, even if there is no Netlabel or labeled IPSEC configuration. Includes definition of "redhat1" SELinux policy capability, which exists in the SELinux userpace library, to keep ordering correct. The SELinux userpace portion of this was merged last year, but this kernel change fell on the floor. Signed-off-by: Chris PeBenito <cpebenito@tresys.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * SELinux: pass a superblock to security_fs_useEric Paris2013-07-251-12/+9
| | | | | | | | | | | | | | | | | | Rather than passing pointers to memory locations, strings, and other stuff just give up on the separation and give security_fs_use the superblock. It just makes the code easier to read (even if not easier to reuse on some other OS) Signed-off-by: Eric Paris <eparis@redhat.com>
| * SELinux: change sbsec->behavior to shortEric Paris2013-07-251-1/+1
| | | | | | | | | | | | | | | | We only have 6 options, so char is good enough, but use a short as that packs nicely. This shrinks the superblock_security_struct just a little bit. Signed-off-by: Eric Paris <eparis@redhat.com>
| * SELinux: fix selinuxfs policy file on big endian systemsEric Paris2013-07-251-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The /sys/fs/selinux/policy file is not valid on big endian systems like ppc64 or s390. Let's see why: static int hashtab_cnt(void *key, void *data, void *ptr) { int *cnt = ptr; *cnt = *cnt + 1; return 0; } static int range_write(struct policydb *p, void *fp) { size_t nel; [...] /* count the number of entries in the hashtab */ nel = 0; rc = hashtab_map(p->range_tr, hashtab_cnt, &nel); if (rc) return rc; buf[0] = cpu_to_le32(nel); rc = put_entry(buf, sizeof(u32), 1, fp); So size_t is 64 bits. But then we pass a pointer to it as we do to hashtab_cnt. hashtab_cnt thinks it is a 32 bit int and only deals with the first 4 bytes. On x86_64 which is little endian, those first 4 bytes and the least significant, so this works out fine. On ppc64/s390 those first 4 bytes of memory are the high order bits. So at the end of the call to hashtab_map nel has a HUGE number. But the least significant 32 bits are all 0's. We then pass that 64 bit number to cpu_to_le32() which happily truncates it to a 32 bit number and does endian swapping. But the low 32 bits are all 0's. So no matter how many entries are in the hashtab, big endian systems always say there are 0 entries because I screwed up the counting. The fix is easy. Use a 32 bit int, as the hashtab_cnt expects, for nel. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
| * SELinux: Increase ebitmap_node size for 64-bit configurationWaiman Long2013-07-251-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the ebitmap_node structure has a fixed size of 32 bytes. On a 32-bit system, the overhead is 8 bytes, leaving 24 bytes for being used as bitmaps. The overhead ratio is 1/4. On a 64-bit system, the overhead is 16 bytes. Therefore, only 16 bytes are left for bitmap purpose and the overhead ratio is 1/2. With a 3.8.2 kernel, a boot-up operation will cause the ebitmap_get_bit() function to be called about 9 million times. The average number of ebitmap_node traversal is about 3.7. This patch increases the size of the ebitmap_node structure to 64 bytes for 64-bit system to keep the overhead ratio at 1/4. This may also improve performance a little bit by making node to node traversal less frequent (< 2) as more bits are available in each node. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * SELinux: Reduce overhead of mls_level_isvalid() function callWaiman Long2013-07-254-19/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While running the high_systime workload of the AIM7 benchmark on a 2-socket 12-core Westmere x86-64 machine running 3.10-rc4 kernel (with HT on), it was found that a pretty sizable amount of time was spent in the SELinux code. Below was the perf trace of the "perf record -a -s" of a test run at 1500 users: 5.04% ls [kernel.kallsyms] [k] ebitmap_get_bit 1.96% ls [kernel.kallsyms] [k] mls_level_isvalid 1.95% ls [kernel.kallsyms] [k] find_next_bit The ebitmap_get_bit() was the hottest function in the perf-report output. Both the ebitmap_get_bit() and find_next_bit() functions were, in fact, called by mls_level_isvalid(). As a result, the mls_level_isvalid() call consumed 8.95% of the total CPU time of all the 24 virtual CPUs which is quite a lot. The majority of the mls_level_isvalid() function invocations come from the socket creation system call. Looking at the mls_level_isvalid() function, it is checking to see if all the bits set in one of the ebitmap structure are also set in another one as well as the highest set bit is no bigger than the one specified by the given policydb data structure. It is doing it in a bit-by-bit manner. So if the ebitmap structure has many bits set, the iteration loop will be done many times. The current code can be rewritten to use a similar algorithm as the ebitmap_contains() function with an additional check for the highest set bit. The ebitmap_contains() function was extended to cover an optional additional check for the highest set bit, and the mls_level_isvalid() function was modified to call ebitmap_contains(). With that change, the perf trace showed that the used CPU time drop down to just 0.08% (ebitmap_contains + mls_level_isvalid) of the total which is about 100X less than before. 0.07% ls [kernel.kallsyms] [k] ebitmap_contains 0.05% ls [kernel.kallsyms] [k] ebitmap_get_bit 0.01% ls [kernel.kallsyms] [k] mls_level_isvalid 0.01% ls [kernel.kallsyms] [k] find_next_bit The remaining ebitmap_get_bit() and find_next_bit() functions calls are made by other kernel routines as the new mls_level_isvalid() function will not call them anymore. This patch also improves the high_systime AIM7 benchmark result, though the improvement is not as impressive as is suggested by the reduction in CPU time spent in the ebitmap functions. The table below shows the performance change on the 2-socket x86-64 system (with HT on) mentioned above. +--------------+---------------+----------------+-----------------+ | Workload | mean % change | mean % change | mean % change | | | 10-100 users | 200-1000 users | 1100-2000 users | +--------------+---------------+----------------+-----------------+ | high_systime | +0.1% | +0.9% | +2.6% | +--------------+---------------+----------------+-----------------+ Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* | SELinux: Add new labeling type native labelsDavid Quigley2013-06-081-1/+4
|/ | | | | | | | | | | | | | | | | There currently doesn't exist a labeling type that is adequate for use with labeled NFS. Since NFS doesn't really support xattrs we can't use the use xattr labeling behavior. For this we developed a new labeling type. The native labeling type is used solely by NFS to ensure NFS inodes are labeled at runtime by the NFS code instead of relying on the SELinux security server on the client end. Acked-by: Eric Paris <eparis@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com> Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg> Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg> Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* userns: Convert selinux to use kuid and kgid where appropriateEric W. Biederman2012-09-211-1/+1
| | | | | | | Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <james.l.morris@oracle.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* SELinux: avc: remove the useless fields in avc_add_callbackWanlong Gao2012-04-091-4/+2
| | | | | | | | | avc_add_callback now just used for registering reset functions in initcalls, and the callback functions just did reset operations. So, reducing the arguments to only one event is enough now. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* SELinux: possible NULL deref in context_struct_to_stringEric Paris2012-04-091-3/+5
| | | | | | | | It's possible that the caller passed a NULL for scontext. However if this is a defered mapping we might still attempt to call *scontext=kstrdup(). This is bad. Instead just return the len. Signed-off-by: Eric Paris <eparis@redhat.com>
* SELinux: add default_type statementsEric Paris2012-04-093-5/+31
| | | | | | | | Because Fedora shipped userspace based on my development tree we now have policy version 27 in the wild defining only default user, role, and range. Thus to add default_type we need a policy.28. Signed-off-by: Eric Paris <eparis@redhat.com>
* SELinux: allow default source/target selectors for user/role/rangeEric Paris2012-04-095-7/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When new objects are created we have great and flexible rules to determine the type of the new object. We aren't quite as flexible or mature when it comes to determining the user, role, and range. This patch adds a new ability to specify the place a new objects user, role, and range should come from. For users and roles it can come from either the source or the target of the operation. aka for files the user can either come from the source (the running process and todays default) or it can come from the target (aka the parent directory of the new file) examples always are done with directory context: system_u:object_r:mnt_t:s0-s0:c0.c512 process context: unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 [no rule] unconfined_u:object_r:mnt_t:s0 test_none [default user source] unconfined_u:object_r:mnt_t:s0 test_user_source [default user target] system_u:object_r:mnt_t:s0 test_user_target [default role source] unconfined_u:unconfined_r:mnt_t:s0 test_role_source [default role target] unconfined_u:object_r:mnt_t:s0 test_role_target [default range source low] unconfined_u:object_r:mnt_t:s0 test_range_source_low [default range source high] unconfined_u:object_r:mnt_t:s0:c0.c1023 test_range_source_high [default range source low-high] unconfined_u:object_r:mnt_t:s0-s0:c0.c1023 test_range_source_low-high [default range target low] unconfined_u:object_r:mnt_t:s0 test_range_target_low [default range target high] unconfined_u:object_r:mnt_t:s0:c0.c512 test_range_target_high [default range target low-high] unconfined_u:object_r:mnt_t:s0-s0:c0.c512 test_range_target_low-high Signed-off-by: Eric Paris <eparis@redhat.com>
* selinux: Casting (void *) value returned by kmalloc is uselessThomas Meyer2011-12-191-1/+1
| | | | | | | | The semantic patch that makes this change is available in scripts/coccinelle/api/alloc/drop_kmalloc_cast.cocci. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: James Morris <jmorris@namei.org>
* selinux: sparse fix: fix several warnings in the security server codeJames Morris2011-09-103-3/+2
| | | | | | Fix several sparse warnings in the SELinux security server code. Signed-off-by: James Morris <jmorris@namei.org>
* selinux: sparse fix: fix warnings in netlink codeJames Morris2011-09-101-2/+0
| | | | | | Fix sparse warnings in SELinux Netlink code. Signed-off-by: James Morris <jmorris@namei.org>
* selinux: sparse fix: move selinux_complete_initJames Morris2011-09-101-1/+0
| | | | | | Sparse fix: move selinux_complete_init Signed-off-by: James Morris <jmorris@namei.org>
* doc: Update the email address for Paul Moore in various source filesPaul Moore2011-08-025-5/+5
| | | | | | | | | | My @hp.com will no longer be valid starting August 5, 2011 so an update is necessary. My new email address is employer independent so we don't have to worry about doing this again any time soon. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux ↵James Morris2011-06-151-0/+3
|\ | | | | | | into for-linus
| * SELinux: skip file_name_trans_write() when policy downgraded.Roy.Li2011-06-141-0/+3
| | | | | | | | | | | | | | | | When policy version is less than POLICYDB_VERSION_FILENAME_TRANS, skip file_name_trans_write(). Signed-off-by: Roy.Li <rongqing.li@windriver.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * Merge commit 'v2.6.39' into 20110526Eric Paris2011-05-261-2/+2
| |\ | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: lib/flex_array.c security/selinux/avc.c security/selinux/hooks.c security/selinux/ss/policydb.c security/smack/smack_lsm.c
* | | selinux: don't pass in NULL avd to avc_has_perm_noauditLinus Torvalds2011-05-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now security_get_user_sids() will pass in a NULL avd pointer to avc_has_perm_noaudit(), which then forces that function to have a dummy entry for that case and just generally test it. Don't do it. The normal callers all pass a real avd pointer, and this helper function is incredibly hot. So don't make avc_has_perm_noaudit() do conditional stuff that isn't needed for the common case. This also avoids some duplicated stack space. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into ↵James Morris2011-05-243-111/+217
|\| | | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | for-linus Conflicts: lib/flex_array.c security/selinux/avc.c security/selinux/hooks.c security/selinux/ss/policydb.c security/smack/smack_lsm.c Manually resolve conflicts. Signed-off-by: James Morris <jmorris@namei.org>
| * flex_array: flex_array_prealloc takes a number of elements, not an endEric Paris2011-04-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change flex_array_prealloc to take the number of elements for which space should be allocated instead of the last (inclusive) element. Users and documentation are updated accordingly. flex_arrays got introduced before they had users. When folks started using it, they ended up needing a different API than was coded up originally. This swaps over to the API that folks apparently need. Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Paris <eparis@redhat.com> Tested-by: Chris Richards <gizmo@giz-works.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Cc: stable@kernel.org [2.6.38+]
| * SELinux: put name based create rules in a hashtableEric Paris2011-04-283-61/+135
| | | | | | | | | | | | | | | | | | | | To shorten the list we need to run if filename trans rules exist for the type of the given parent directory I put them in a hashtable. Given the policy we are expecting to use in Fedora this takes the worst case list run from about 5,000 entries to 17. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
| * SELinux: generic hashtab entry counterEric Paris2011-04-281-2/+2
| | | | | | | | | | | | | | | | Instead of a hashtab entry counter function only useful for range transition rules make a function generic for any hashtable to use. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
| * SELinux: calculate and print hashtab stats with a generic functionEric Paris2011-04-281-19/+13
| | | | | | | | | | | | | | | | | | We have custom debug functions like rangetr_hash_eval and symtab_hash_eval which do the same thing. Just create a generic function that takes the name of the hash table as an argument instead of having custom functions. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
| * SELinux: skip filename trans rules if ttype does not match parent dirEric Paris2011-04-283-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now we walk to filename trans rule list for every inode that is created. First passes at policy using this facility creates around 5000 filename trans rules. Running a list of 5000 entries every time is a bad idea. This patch adds a new ebitmap to policy which has a bit set for each ttype that has at least 1 filename trans rule. Thus when an inode is created we can quickly determine if any rules exist for this parent directory type and can skip the list if we know there is definitely no relevant entry. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
| * SELinux: rename filename_compute_type argument to *type instead of *conEric Paris2011-04-281-3/+3
| | | | | | | | | | | | | | | | | | filename_compute_type() takes as arguments the numeric value of the type of the subject and target. It does not take a context. Thus the names are misleading. Fix the argument names. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
| * SELinux: fix comment to state filename_compute_type takes an objname not a qstrEric Paris2011-04-281-1/+1
| | | | | | | | | | | | | | filename_compute_type used to take a qstr, but it now takes just a name. Fix the comments to indicate it is an objname, not a qstr. Signed-off-by: Eric Paris <eparis@redhat.com>
| * SELinux: security_read_policy should take a size_t not ssize_tEric Paris2011-04-251-1/+1
| | | | | | | | | | | | | | | | The len should be an size_t but is a ssize_t. Easy enough fix to silence build warnings. We have no need for signed-ness. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
| * SELinux: delete debugging printks from filename_trans rule processingEric Paris2011-04-201-4/+0
| | | | | | | | | | | | | | | | The filename_trans rule processing has some printk(KERN_ERR ) messages which were intended as debug aids in creating the code but weren't removed before it was submitted. Remove them. Signed-off-by: Eric Paris <eparis@redhat.com>
| * Initialize policydb.process_class eariler.Harry Ciao2011-04-071-5/+5
| | | | | | | | | | | | | | | | | | Initialize policydb.process_class once all symtabs read from policy image, so that it could be used to setup the role_trans.tclass field when a lower version policy.X is loaded. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * selinux: Fix regression for XorgStephen Smalley2011-04-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | Commit 6f5317e730505d5cbc851c435a2dfe3d5a21d343 introduced a bug in the handling of userspace object classes that is causing breakage for Xorg when XSELinux is enabled. Fix the bug by changing map_class() to return SECCLASS_NULL when the class cannot be mapped to a kernel object class. Reported-by: "Justin P. Mattock" <justinmattock@gmail.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
| * selinux: add type_transition with name extension support for selinuxfsKohei Kaigai2011-04-011-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The attached patch allows /selinux/create takes optional 4th argument to support TYPE_TRANSITION with name extension for userspace object managers. If 4th argument is not supplied, it shall perform as existing kernel. In fact, the regression test of SE-PostgreSQL works well on the patched kernel. Thanks, Signed-off-by: KaiGai Kohei <kohei.kaigai@eu.nec.com> [manually verify fuzz was not an issue, and it wasn't: eparis] Signed-off-by: Eric Paris <eparis@redhat.com>
| * SELinux: Write class field in role_trans_write.Harry Ciao2011-03-281-2/+9
| | | | | | | | | | | | | | | | | | If kernel policy version is >= 26, then write the class field of the role_trans structure into the binary reprensentation. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Eric Paris <eparis@redhat.com>