Merging with current kernel changes #1

lehmann · 2013-10-25T23:50:40Z

Merging with current kernel changes

3.0 wip

Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. [On Android, this can only be set by root/CAP_MAC_ADMIN processes, and if running SELinux in enforcing mode, only if mac_admin permission is granted in policy. In Android 4.4, this would only be allowed for root/CAP_MAC_ADMIN processes that are also in unconfined domains. In current AOSP master, mac_admin is not allowed for any domains except the recovery console which has a legitimate need for it. The other potential vector is mounting a maliciously crafted filesystem for which SELinux fetches xattrs (e.g. an ext4 filesystem on a SDcard). However, the end result is only a local denial-of-service (DOS) due to kernel BUG. This fix is queued for 3.14.] Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [Evervolv#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec Evervolv#1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a [ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b [ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179 [ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4 [ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31 [ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22 [ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d [ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91 [ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b [ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30 [ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP <ffff8805c0ac3c38> [ 475.328734] ---[ end trace f076482e9d754adc ]--- [sds: commit message edited to note Android implications and to generate a unique Change-Id for gerrit] Change-Id: I4d5389f0cfa72b5f59dada45081fa47e03805413 Reported-by: Matthew Thode <[email protected]> Signed-off-by: Stephen Smalley <[email protected]> Cc: [email protected] Signed-off-by: Paul Moore <[email protected]>

Below Kernel panic is observed due to race condition, where sock_has_perm called in a thread and is trying to access sksec->sid without checking sksec. Just before that, sk->sk_security was set to NULL by selinux_sk_free_security through sk_free in other thread. 31704.949269: <3> IPv4: Attempt to release alive inet socket dd81b200 31704.959049: <1> Unable to handle kernel NULL pointer dereference at \ virtual address 00000000 31704.983562: <1> pgd = c6b74000 31704.985248: <1> [00000000] *pgd=00000000 31704.996591: <0> Internal error: Oops: 5 [Evervolv#1] PREEMPT SMP ARM 31705.001016: <6> Modules linked in: adsprpc [last unloaded: wlan] 31705.006659: <6> CPU: 1 Tainted: G O \ (3.4.0-g837ab9b-00003-g6bcd9c6 Evervolv#1) 31705.014042: <6> PC is at sock_has_perm+0x58/0xd4 31705.018292: <6> LR is at sock_has_perm+0x58/0xd4 31705.022546: <6> pc : [<c0341e8c>] lr : [<c0341e8c>] \ psr: 60000013 31705.022549: <6> sp : dda27f00 ip : 00000000 fp : 5f36fc84 31705.034002: <6> r10: 00004000 r9 : 0000009d r8 : e8c2b700 31705.039211: <6> r7 : dda27f24 r6 : dd81b200 r5 : 00000000 \ r4 : 00000000 31705.045721: <6> r3 : 00000000 r2 : dda27ef8 r1 : 00000000 \ r0 : dda27f54 31705.052232: <6> Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM \ Segment user 31705.059349: <6> Control: 10c5787d Table: 10d7406a DAC: 00000015 . . . . 31705.697816: <6> [<c0341e8c>] (sock_has_perm+0x58/0xd4) from \ [<c033ed10>] (security_socket_getsockopt+0x14/0x1c) 31705.707534: <6> [<c033ed10>] (security_socket_getsockopt+0x14/0x1c) \ from [<c0745c18>] (sys_getsockopt+0x34/0xa8) 31705.717343: <6> [<c0745c18>] (sys_getsockopt+0x34/0xa8) from \ [<c0106140>] (ret_fast_syscall+0x0/0x30) 31705.726193: <0> Code: e59832e8 e5933058 e5939004 ebfac736 (e5953000) 31705.732635: <4> ---[ end trace 22889004dafd87bd ]--- Change-Id: I79c3fb525f35ea2494d53788788cd71a38a32d6b Signed-off-by: Satya Durga Srinivasu Prabhala <[email protected]> Signed-off-by: Osvaldo Banuelos <[email protected]>

Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. [On Android, this can only be set by root/CAP_MAC_ADMIN processes, and if running SELinux in enforcing mode, only if mac_admin permission is granted in policy. In Android 4.4, this would only be allowed for root/CAP_MAC_ADMIN processes that are also in unconfined domains. In current AOSP master, mac_admin is not allowed for any domains except the recovery console which has a legitimate need for it. The other potential vector is mounting a maliciously crafted filesystem for which SELinux fetches xattrs (e.g. an ext4 filesystem on a SDcard). However, the end result is only a local denial-of-service (DOS) due to kernel BUG. This fix is queued for 3.14.] Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [Evervolv#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec Evervolv#1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a [ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b [ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179 [ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4 [ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31 [ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22 [ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d [ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91 [ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b [ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30 [ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP <ffff8805c0ac3c38> [ 475.328734] ---[ end trace f076482e9d754adc ]--- [sds: commit message edited to note Android implications and to generate a unique Change-Id for gerrit] Change-Id: I4d5389f0cfa72b5f59dada45081fa47e03805413 Reported-by: Matthew Thode <[email protected]> Signed-off-by: Stephen Smalley <[email protected]> Cc: [email protected] Signed-off-by: Paul Moore <[email protected]>

Below Kernel panic is observed due to race condition, where sock_has_perm called in a thread and is trying to access sksec->sid without checking sksec. Just before that, sk->sk_security was set to NULL by selinux_sk_free_security through sk_free in other thread. 31704.949269: <3> IPv4: Attempt to release alive inet socket dd81b200 31704.959049: <1> Unable to handle kernel NULL pointer dereference at \ virtual address 00000000 31704.983562: <1> pgd = c6b74000 31704.985248: <1> [00000000] *pgd=00000000 31704.996591: <0> Internal error: Oops: 5 [Evervolv#1] PREEMPT SMP ARM 31705.001016: <6> Modules linked in: adsprpc [last unloaded: wlan] 31705.006659: <6> CPU: 1 Tainted: G O \ (3.4.0-g837ab9b-00003-g6bcd9c6 Evervolv#1) 31705.014042: <6> PC is at sock_has_perm+0x58/0xd4 31705.018292: <6> LR is at sock_has_perm+0x58/0xd4 31705.022546: <6> pc : [<c0341e8c>] lr : [<c0341e8c>] \ psr: 60000013 31705.022549: <6> sp : dda27f00 ip : 00000000 fp : 5f36fc84 31705.034002: <6> r10: 00004000 r9 : 0000009d r8 : e8c2b700 31705.039211: <6> r7 : dda27f24 r6 : dd81b200 r5 : 00000000 \ r4 : 00000000 31705.045721: <6> r3 : 00000000 r2 : dda27ef8 r1 : 00000000 \ r0 : dda27f54 31705.052232: <6> Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM \ Segment user 31705.059349: <6> Control: 10c5787d Table: 10d7406a DAC: 00000015 . . . . 31705.697816: <6> [<c0341e8c>] (sock_has_perm+0x58/0xd4) from \ [<c033ed10>] (security_socket_getsockopt+0x14/0x1c) 31705.707534: <6> [<c033ed10>] (security_socket_getsockopt+0x14/0x1c) \ from [<c0745c18>] (sys_getsockopt+0x34/0xa8) 31705.717343: <6> [<c0745c18>] (sys_getsockopt+0x34/0xa8) from \ [<c0106140>] (ret_fast_syscall+0x0/0x30) 31705.726193: <0> Code: e59832e8 e5933058 e5939004 ebfac736 (e5953000) 31705.732635: <4> ---[ end trace 22889004dafd87bd ]--- Change-Id: I79c3fb525f35ea2494d53788788cd71a38a32d6b Signed-off-by: Satya Durga Srinivasu Prabhala <[email protected]> Signed-off-by: Osvaldo Banuelos <[email protected]>

commit 55fc05b7414274f17795cd0e8a3b1546f3649d5e upstream. At http://dev.laptop.org/ticket/11980 we have determined that the Marvell CaFe SDHCI controller reports bad card presence during resume. It reports that no card is present even when it is. This is a regression -- resume worked back around 2.6.37. Around 400ms after resuming, a "card inserted" interrupt is generated, at which point it starts reporting presence. Work around this hardware oddity by setting the SDHCI_QUIRK_BROKEN_CARD_DETECTION flag. Thanks to Chris Ball for helping with diagnosis. Signed-off-by: Daniel Drake <[email protected]> Signed-off-by: Chris Ball <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> powerpc/ftrace: Fix assembly trampoline register usage commit fd5a42980e1cf327b7240adf5e7b51ea41c23437 upstream. Just like the module loader, ftrace needs to be updated to use r12 instead of r11 with newer gcc's. Signed-off-by: Roger Blofeld <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> powerpc: Add "memory" attribute for mfmsr() commit b416c9a10baae6a177b4f9ee858b8d309542fbef upstream. Add "memory" attribute in inline assembly language as a compiler barrier to make sure 4.6.x GCC don't reorder mfmsr(). Signed-off-by: Tiejun Chen <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> powerpc: Fix wrong divisor in usecs_to_cputime commit 9f5072d4f63f28d30d343573830ac6c85fc0deff upstream. Commit d57af9b (taskstats: use real microsecond granularity for CPU times) renamed msecs_to_cputime to usecs_to_cputime, but failed to update all numbers on the way. This causes nonsensical cpu idle/iowait values to be displayed in /proc/stat (the only user of usecs_to_cputime so far). This also renames __cputime_msec_factor to __cputime_usec_factor, adapting its value and using it directly in cputime_to_usecs instead of doing two multiplications. Signed-off-by: Andreas Schwab <[email protected]> Acked-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: libsas: continue revalidation commit 26f2f199ff150d8876b2641c41e60d1c92d2fb81 upstream. Continue running revalidation until no more broadcast devices are discovered. Fixes cases where re-discovery completes too early in a domain with multiple expanders with pending re-discovery events. Servicing BCNs can get backed up behind error recovery. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: libsas: fix sas_discover_devices return code handling commit b17caa174a7e1fd2e17b26e210d4ee91c4c28b37 upstream. commit 198439e [SCSI] libsas: do not set res = 0 in sas_ex_discover_dev() commit 19252de [SCSI] libsas: fix wide port hotplug issues The above commits seem to have confused the return value of sas_ex_discover_dev which is non-zero on failure and sas_ex_join_wide_port which just indicates short circuiting discovery on already established ports. The result is random discovery failures depending on configuration. Calls to sas_ex_join_wide_port are the source of the trouble as its return value is errantly assigned to 'res'. Convert it to bool and stop returning its result up the stack. Tested-by: Dan Melnic <[email protected]> Reported-by: Dan Melnic <[email protected]> Signed-off-by: Dan Williams <[email protected]> Reviewed-by: Jack Wang <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations) commit 57fc2e335fd3c2f898ee73570dc81426c28dc7b4 upstream. Rapid ata hotplug on a libsas controller results in cases where libsas is waiting indefinitely on eh to perform an ata probe. A race exists between scsi_schedule_eh() and scsi_restart_operations() in the case when scsi_restart_operations() issues i/o to other devices in the sas domain. When this happens the host state transitions from SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and ->host_busy is non-zero so we put the eh thread to sleep even though ->host_eh_scheduled is active. Before putting the error handler to sleep we need to check if the host_state needs to return to SHOST_RECOVERY for another trip through eh. Since i/o that is released by scsi_restart_operations has been blocked for at least one eh cycle, this implementation allows those i/o's to run before another eh cycle starts to discourage hung task timeouts. Reported-by: Tom Jackson <[email protected]> Tested-by: Tom Jackson <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: fix hot unplug vs async scan race commit 3b661a92e869ebe2358de8f4b3230ad84f7fce51 upstream. The following crash results from cases where the end_device has been removed before scsi_sysfs_add_sdev has had a chance to run. BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6 ... Call Trace: [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3 [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8125e641>] kobject_add_varg+0x41/0x50 [<ffffffff8125e70b>] kobject_add+0x64/0x66 [<ffffffff8131122b>] device_add+0x12d/0x63a [<ffffffff814b65ea>] ? _raw_spin_unlock_irqrestore+0x47/0x56 [<ffffffff8107de15>] ? module_refcount+0x89/0xa0 [<ffffffff8132f348>] scsi_sysfs_add_sdev+0x4e/0x28a [<ffffffff8132dcbb>] do_scan_async+0x9c/0x145 ...teach scsi_sysfs_add_devices() to check for deleted devices() before trying to add them, and teach scsi_remove_target() how to remove targets that have not been added via device_add(). Reported-by: Dariusz Majchrzak <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: Avoid dangling pointer in scsi_requeue_command() commit 940f5d47e2f2e1fa00443921a0abf4822335b54d upstream. When we call scsi_unprep_request() the command associated with the request gets destroyed and therefore drops its reference on the device. If this was the only reference, the device may get released and we end up with a NULL pointer deref when we call blk_requeue_request. Reported-by: Mike Christie <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Mike Christie <[email protected]> Reviewed-by: Tejun Heo <[email protected]> [jejb: enhance commend and add commit log for stable] Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: hda - Add support for Realtek ALC282 commit 4e01ec636e64707d202a1ca21a47bbc6d53085b7 upstream. This codec has a separate dmic path (separate dmic only ADC), and thus it looks mostly like ALC275. BugLink: https://bugs.launchpad.net/bugs/1025377 Tested-by: Ray Chen <[email protected]> Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: OMAP2+: OPP: Fix to ensure check of right oppdef after bad one commit b110547e586eb5825bc1d04aa9147bff83b57672 upstream. Commit 9fa2df6b90786301b175e264f5fa9846aba81a65 (ARM: OMAP2+: OPP: allow OPP enumeration to continue if device is not present) makes the logic: for (i = 0; i < opp_def_size; i++) { <snip> if (!oh || !oh->od) { <snip> continue; } <snip> opp_def++; } In short, the moment we hit a "Bad OPP", we end up looping the list comparing against the bad opp definition pointer for the rest of the iteration count. Instead, increment opp_def in the for loop itself and allow continue to be used in code without much thought so that we check the next set of OPP definition pointers :) Cc: Steve Sakoman <[email protected]> Cc: Tony Lindgren <[email protected]> Signed-off-by: Nishanth Menon <[email protected]> Signed-off-by: Kevin Hilman <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> usb: gadget: Fix g_ether interface link status commit 31bde1ceaa873bcaecd49e829bfabceacc4c512d upstream. A "usb0" interface that has never been connected to a host has an unknown operstate, and therefore the IFF_RUNNING flag is (incorrectly) asserted when queried by ifconfig, ifplugd, etc. This is a result of calling netif_carrier_off() too early in the probe function; it should be called after register_netdev(). Similar problems have been fixed in many other drivers, e.g.: e826eaf (bonding: Call netif_carrier_off after register_netdevice) 0d672e9 (drivers/net: Call netif_carrier_off at the end of the probe) 6a3c869 (cxgb4: fix reported state of interfaces without link) Fix is to move netif_carrier_off() to the end of the function. Signed-off-by: Kevin Cernekee <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> usbdevfs: Correct amount of data copied to user in processcompl_compat commit 2102e06a5f2e414694921f23591f072a5ba7db9f upstream. iso data buffers may have holes in them if some packets were short, so for iso urbs we should always copy the entire buffer, just like the regular processcompl does. Signed-off-by: Hans de Goede <[email protected]> Acked-by: Alan Stern <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> locks: fix checking of fcntl_setlease argument commit 0ec4f431eb56d633da3a55da67d5c4b88886ccc7 upstream. The only checks of the long argument passed to fcntl(fd,F_SETLEASE,.) are done after converting the long to an int. Thus some illegal values may be let through and cause problems in later code. [ They actually *don't* cause problems in mainline, as of Dave Jones's commit 8d657eb3b438 "Remove easily user-triggerable BUG from generic_setlease", but we should fix this anyway. And this patch will be necessary to fix real bugs on earlier kernels. ] Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ftrace: Disable function tracing during suspend/resume and hibernation, again commit 443772d408a25af62498793f6f805ce3c559309a upstream. If function tracing is enabled for some of the low-level suspend/resume functions, it leads to triple fault during resume from suspend, ultimately ending up in a reboot instead of a resume (or a total refusal to come out of suspended state, on some machines). This issue was explained in more detail in commit f42ac38 (ftrace: disable tracing for suspend to ram). However, the changes made by that commit got reverted by commit cbe2f5a (tracing: allow tracing of suspend/resume & hibernation code again). So, unfortunately since things are not yet robust enough to allow tracing of low-level suspend/resume functions, suspend/resume is still broken when ftrace is enabled. So fix this by disabling function tracing during suspend/resume & hibernation. Signed-off-by: Srivatsa S. Bhat <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: kernel/power/suspend.c stable: update references to older 2.6 versions for 3.x commit 2584f5212d97b664be250ad5700a2d0fee31a10d upstream. Also add information on where the respective trees are. Signed-off-by: Paul Gortmaker <[email protected]> Acked-by: Rob Landley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> workqueue: perform cpu down operations from low priority cpu_notifier() commit 6575820221f7a4dd6eadecf7bf83cdd154335eda upstream. Currently, all workqueue cpu hotplug operations run off CPU_PRI_WORKQUEUE which is higher than normal notifiers. This is to ensure that workqueue is up and running while bringing up a CPU before other notifiers try to use workqueue on the CPU. Per-cpu workqueues are supposed to remain working and bound to the CPU for normal CPU_DOWN_PREPARE notifiers. This holds mostly true even with workqueue offlining running with higher priority because workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which runs the per-cpu workqueue without concurrency management without explicitly detaching the existing workers. However, if the trustee needs to create new workers, it creates unbound workers which may wander off to other CPUs while CPU_DOWN_PREPARE notifiers are in progress. Furthermore, if the CPU down is cancelled, the per-CPU workqueue may end up with workers which aren't bound to the CPU. While reliably reproducible with a convoluted artificial test-case involving scheduling and flushing CPU burning work items from CPU down notifiers, this isn't very likely to happen in the wild, and, even when it happens, the effects are likely to be hidden by the following successful CPU down. Fix it by using different priorities for up and down notifiers - high priority for up operations and low priority for down operations. Workqueue cpu hotplug operations will soon go through further cleanup. Signed-off-by: Tejun Heo <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check commit f197ac13f6eeb351b31250b9ab7d0da17434ea36 upstream. In the ac.c, power_supply_register()'s return value is not checked. As a result, the driver's add() ops may return success even though the device failed to initialize. For example, some BIOS may describe two ACADs in the same DSDT. The second ACAD device will fail to register, but ACPI driver's add() ops returns sucessfully. The ACPI device will receive ACPI notification and cause OOPS. https://bugzilla.redhat.com/show_bug.cgi?id=772730 Signed-off-by: Lan Tianyu <[email protected]> Signed-off-by: Len Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Btrfs: call the ordered free operation without any locks held commit e9fbcb42201c862fd6ab45c48ead4f47bb2dea9d upstream. Each ordered operation has a free callback, and this was called with the worker spinlock held. Josef made the free callback also call iput, which we can't do with the spinlock. This drops the spinlock for the free operation and grabs it again before moving through the rest of the list. We'll circle back around to this and find a cleaner way that doesn't bounce the lock around so much. Signed-off-by: Chris Mason <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: Try harder to avoid HW cursor ending on a multiple of 128 columns. commit f60ec4c7df043df81e62891ac45383d012afe0da upstream. This could previously fail if either of the enabled displays was using a horizontal resolution that is a multiple of 128, and only the leftmost column of the cursor was (supposed to be) visible at the right edge of that display. The solution is to move the cursor one pixel to the left in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33183 Signed-off-by: Michel Dänzer <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: fix non revealent error message commit 8d1c702aa0b2c4b22b0742b72a1149d91690674b upstream. We want to print link status query failed only if it's an unexepected fail. If we query to see if we need link training it might be because there is nothing connected and thus link status query have the right to fail in that case. To avoid printing failure when it's expected, move the failure message to proper place. Signed-off-by: Jerome Glisse <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: fix hotplug of DP to DVI|HDMI passive adapters (v2) commit 266dcba541a1ef7e5d82d9e67c67fde2910636e8 upstream. No need to retrain the link for passive adapters. v2: agd5f - no passive DP to VGA adapters, update comments - assign radeon_connector_atom_dig after we are sure we have a digital connector as analog connectors have different private data. - get new sink type before checking for retrain. No need to check if it's no longer a DP connection. Signed-off-by: Jerome Glisse <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: on hotplug force link training to happen (v2) commit ca2ccde5e2f24a792caa4cca919fc5c6f65d1887 upstream. To have DP behave like VGA/DVI we need to retrain the link on hotplug. For this to happen we need to force link training to happen by setting connector dpms to off before asking it turning it on again. v2: agd5f - drop the dp_get_link_status() change in atombios_dp.c for now. We still need the dpms OFF change. Signed-off-by: Jerome Glisse <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> nfsd4: our filesystems are normally case sensitive commit 2930d381d22b9c56f40dd4c63a8fa59719ca2c3c upstream. Actually, xfs and jfs can optionally be case insensitive; we'll handle that case in later patches. Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> nfs: skip commit in releasepage if we're freeing memory for fs-related reasons commit 5cf02d09b50b1ee1c2d536c9cf64af5a7d433f56 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 Evervolv#1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] Evervolv#2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f Evervolv#3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 Evervolv#4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] Evervolv#5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] Evervolv#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ext4: don't let i_reserved_meta_blocks go negative commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream. If we hit a condition where we have allocated metadata blocks that were not appropriately reserved, we risk underflow of ei->i_reserved_meta_blocks. In turn, this can throw sbi->s_dirtyclusters_counter significantly out of whack and undermine the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this occurs and set i_allocated_meta_blocks to avoid this problem. This condition is reproduced by xfstests 270 against ext2 with delalloc enabled: Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28 Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost 270 ultimately fails with an inconsistent filesystem and requires an fsck to repair. The cause of the error is an underflow in ext4_da_update_reserve_space() due to an unreserved meta block allocation. Signed-off-by: Brian Foster <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ext4: pass a char * to ext4_count_free() instead of a buffer_head ptr commit f6fb99cadcd44660c68e13f6eab28333653621e6 upstream. Make it possible for ext4_count_free to operate on buffers and not just data in buffer_heads. Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> bnx2: Fix bug in bnx2_free_tx_skbs(). [ Upstream commit c1f5163de417dab01fa9daaf09a74bbb19303f3c ] In rare cases, bnx2x_free_tx_skbs() can unmap the wrong DMA address when it gets to the last entry of the tx ring. We were not using the proper macro to skip the last entry when advancing the tx index. Reported-by: Zongyun Lai <[email protected]> Reviewed-by: Jeffrey Huang <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> sch_sfb: Fix missing NULL check [ Upstream commit 7ac2908e4b2edaec60e9090ddb4d9ceb76c05e7d ] Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=44461 Signed-off-by: Alan Cox <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> sctp: Fix list corruption resulting from freeing an association on a list [ Upstream commit 2eebc1e188e9e45886ee00662519849339884d6d ] A few days ago Dave Jones reported this oops: [22766.294255] general protection fault: 0000 [Evervolv#1] PREEMPT SMP [22766.295376] CPU 0 [22766.295384] Modules linked in: [22766.387137] ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90 ffff880147c03a74 [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000 [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000, [22766.387137] Stack: [22766.387140] ffff880147c03a10 [22766.387140] ffffffffa169f2b6 [22766.387140] ffff88013ed95728 [22766.387143] 0000000000000002 [22766.387143] 0000000000000000 [22766.387143] ffff880003fad062 [22766.387144] ffff88013c120000 [22766.387144] [22766.387145] Call Trace: [22766.387145] <IRQ> [22766.387150] [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0 [sctp] [22766.387154] [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp] [22766.387157] [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp] [22766.387161] [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0 [22766.387163] [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210 [22766.387166] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 [22766.387168] [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0 [22766.387169] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 [22766.387171] [<ffffffff81590a07>] ip_local_deliver+0x47/0x80 [22766.387172] [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680 [22766.387174] [<ffffffff81590c54>] ip_rcv+0x214/0x320 [22766.387176] [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910 [22766.387178] [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910 [22766.387180] [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40 [22766.387182] [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0 [22766.387183] [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440 [22766.387185] [<ffffffff81559280>] napi_skb_finish+0x70/0xa0 [22766.387187] [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130 [22766.387218] [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e] [22766.387242] [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e] [22766.387266] [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e] [22766.387268] [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0 [22766.387270] [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130 [22766.387273] [<ffffffff810734d0>] __do_softirq+0xe0/0x420 [22766.387275] [<ffffffff8169826c>] call_softirq+0x1c/0x30 [22766.387278] [<ffffffff8101db15>] do_softirq+0xd5/0x110 [22766.387279] [<ffffffff81073bc5>] irq_exit+0xd5/0xe0 [22766.387281] [<ffffffff81698b03>] do_IRQ+0x63/0xd0 [22766.387283] [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f [22766.387283] <EOI> [22766.387284] [22766.387285] [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00 48 89 fb 49 89 f5 66 c1 c0 08 66 39 46 02 [22766.387307] [22766.387307] RIP [22766.387311] [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp] [22766.387311] RSP <ffff880147c039b0> [22766.387142] ffffffffa16ab120 [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]--- [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt It appears from his analysis and some staring at the code that this is likely occuring because an association is getting freed while still on the sctp_assoc_hashtable. As a result, we get a gpf when traversing the hashtable while a freed node corrupts part of the list. Nominally I would think that an mibalanced refcount was responsible for this, but I can't seem to find any obvious imbalance. What I did note however was that the two places where we create an association using sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths which free a newly created association after calling sctp_primitive_ASSOCIATE. sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to the aforementioned hash table. the sctp command interpreter that process side effects has not way to unwind previously processed commands, so freeing the association from the __sctp_connect or sctp_sendmsg error path would lead to a freed association remaining on this hash table. I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init, which allows us to proerly use hlist_unhashed to check if the node is on a hashlist safely during a delete. That in turn alows us to safely call sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths before freeing them, regardles of what the associations state is on the hash list. I noted, while I was doing this, that the __sctp_unhash_endpoint was using hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes pointers to make that function work properly, so I fixed that up in a simmilar fashion. I attempted to test this using a virtual guest running the SCTP_RR test from netperf in a loop while running the trinity fuzzer, both in a loop. I wasn't able to recreate the problem prior to this fix, nor was I able to trigger the failure after (neither of which I suppose is suprising). Given the trace above however, I think its likely that this is what we hit. Signed-off-by: Neil Horman <[email protected]> Reported-by: [email protected] CC: [email protected] CC: "David S. Miller" <[email protected]> CC: Vlad Yasevich <[email protected]> CC: Sridhar Samudrala <[email protected]> CC: [email protected] Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> caif: Fix access to freed pernet memory [ Upstream commit 96f80d123eff05c3cd4701463786b87952a6c3ac ] unregister_netdevice_notifier() must be called before unregister_pernet_subsys() to avoid accessing already freed pernet memory. This fixes the following oops when doing rmmod: Call Trace: [<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif] [<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100 [<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif] [<ffffffff810e7734>] sys_delete_module+0x1a4/0x300 [<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0 [<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3 [<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f RIP [<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif] Signed-off-by: Sjur Brændeland <[email protected]> Acked-by: "Eric W. Biederman" <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> cipso: don't follow a NULL pointer when setsockopt() is called [ Upstream commit 89d7ae34cdda4195809a5a987f697a517a2a3177 ] As reported by Alan Cox, and verified by Lin Ming, when a user attempts to add a CIPSO option to a socket using the CIPSO_V4_TAG_LOCAL tag the kernel dies a terrible death when it attempts to follow a NULL pointer (the skb argument to cipso_v4_validate() is NULL when called via the setsockopt() syscall). This patch fixes this by first checking to ensure that the skb is non-NULL before using it to find the incoming network interface. In the unlikely case where the skb is NULL and the user attempts to add a CIPSO option with the _TAG_LOCAL tag we return an error as this is not something we want to allow. A simple reproducer, kindly supplied by Lin Ming, although you must have the CIPSO DOI Evervolv#3 configure on the system first or you will be caught early in cipso_v4_validate(): #include <sys/types.h> #include <sys/socket.h> #include <linux/ip.h> #include <linux/in.h> #include <string.h> struct local_tag { char type; char length; char info[4]; }; struct cipso { char type; char length; char doi[4]; struct local_tag local; }; int main(int argc, char **argv) { int sockfd; struct cipso cipso = { .type = IPOPT_CIPSO, .length = sizeof(struct cipso), .local = { .type = 128, .length = sizeof(struct local_tag), }, }; memset(cipso.doi, 0, 4); cipso.doi[3] = 3; sockfd = socket(AF_INET, SOCK_DGRAM, 0); #define SOL_IP 0 setsockopt(sockfd, SOL_IP, IP_OPTIONS, &cipso, sizeof(struct cipso)); return 0; } CC: Lin Ming <[email protected]> Reported-by: Alan Cox <[email protected]> Signed-off-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> caif: fix NULL pointer check [ Upstream commit c66b9b7d365444b433307ebb18734757cb668a02 ] Reported-by: <[email protected]> Resolves-bug: http://bugzilla.kernel.org/show_bug?44441 Signed-off-by: Alan Cox <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> wanmain: comparing array with NULL [ Upstream commit 8b72ff6484fe303e01498b58621810a114f3cf09 ] gcc really should warn about these ! Signed-off-by: Alan Cox <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tcp: Add TCP_USER_TIMEOUT negative value check [ Upstream commit 42493570100b91ef663c4c6f0c0fdab238f9d3c2 ] TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int. But patch "tcp: Add TCP_USER_TIMEOUT socket option"(dca43c7) didn't check the negative values. If a user assign -1 to it, the socket will set successfully and wait for 4294967295 miliseconds. This patch add a negative value check to avoid this issue. Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: kaweth.c: use GFP_ATOMIC under spin_lock [ Upstream commit e4c7f259c5be99dcfc3d98f913590663b0305bf8 ] The problem is that we call this with a spin lock held. The call tree is: kaweth_start_xmit() holds kaweth->device_lock. -> kaweth_async_set_rx_mode() -> kaweth_control() -> kaweth_internal_control_msg() The kaweth_internal_control_msg() function is only called from kaweth_control() which used GFP_ATOMIC for its allocations. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handling [ Upstream commit b1beb681cba5358f62e6187340660ade226a5fcc ] When device flags are set using rtnetlink, IFF_PROMISC and IFF_ALLMULTI flags are handled specially. Function dev_change_flags sets IFF_PROMISC and IFF_ALLMULTI bits in dev->gflags according to the passed value but do_setlink passes a result of rtnl_dev_combine_flags which takes those bits from dev->flags. This can be easily trigerred by doing: tcpdump -i eth0 & ip l s up eth0 ip sets IFF_UP flag in ifi_flags and ifi_change, which is combined with IFF_PROMISC by rtnl_dev_combine_flags, causing __dev_change_flags to set IFF_PROMISC in gflags. Reported-by: Max Matveev <[email protected]> Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tcp: perform DMA to userspace only if there is a task waiting for it [ Upstream commit 59ea33a68a9083ac98515e4861c00e71efdc49a1 ] Back in 2006, commit 1a2449a ("[I/OAT]: TCP recv offload to I/OAT") added support for receive offloading to IOAT dma engine if available. The code in tcp_rcv_established() tries to perform early DMA copy if applicable. It however does so without checking whether the userspace task is actually expecting the data in the buffer. This is not a problem under normal circumstances, but there is a corner case where this doesn't work -- and that's when MSG_TRUNC flag to recvmsg() is used. If the IOAT dma engine is not used, the code properly checks whether there is a valid ucopy.task and the socket is owned by userspace, but misses the check in the dmaengine case. This problem can be observed in real trivially -- for example 'tbench' is a good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they have been already early-copied in tcp_rcv_established() using dma engine. This patch introduces the same check we are performing in the simple iovec copy case to the IOAT case as well. It fixes the indefinite recvmsg(MSG_TRUNC) hangs. Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net/tun: fix ioctl() based info leaks [ Upstream commits a117dacde0288f3ec60b6e5bcedae8fa37ee0dfc and 8bbb181308bc348e02bfdbebdedd4e4ec9d452ce ] The tun module leaks up to 36 bytes of memory by not fully initializing a structure located on the stack that gets copied to user memory by the TUNGETIFF and SIOCGIFHWADDR ioctl()s. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: drivers/net/tun.c USB: echi-dbgp: increase the controller wait time to come out of halt. commit f96a4216e85050c0a9d41a41ecb0ae9d8e39b509 upstream. The default 10 microsecond delay for the controller to come out of halt in dbgp_ehci_startup is too short, so increase it to 1 millisecond. This is based on emperical testing on various USB debug ports on modern machines such as a Lenovo X220i and an Ivybridge development platform that needed to wait ~450-950 microseconds. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Jason Wessel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: snd-usb: fix clock source validity index commit aff252a848ce21b431ba822de3dab9c4c94571cb upstream. uac_clock_source_is_valid() uses the control selector value to access the bmControls bitmap of the clock source unit. This is wrong, as control selector values start from 1, while the bitmap uses all available bits. In other words, "Clock Validity Control" is stored in D3..2, not D5..4 of the clock selector unit's bmControls. Signed-off-by: Daniel Mack <[email protected]> Reported-by: Andreas Koch <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: mpu401: Fix missing initialization of irq field commit bc733d495267a23ef8660220d696c6e549ce30b3 upstream. The irq field of struct snd_mpu401 is supposed to be initialized to -1. Since it's set to zero as of now, a probing error before the irq installation results in a kernel warning "Trying to free already-free IRQ 0". Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=44821 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ASoC: wm8962: Allow VMID time to fully ramp commit 9d40e5582c9c4cfb6977ba2a0ca9c2ed82c56f21 upstream. Required for reliable power up from cold. Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ASoC: wm8994: Ensure there are enough BCLKs for four channels commit b8edf3e5522735c8ce78b81845f7a1a2d4a08626 upstream. Otherwise if someone tries to use all four channels on AIF1 with the device in master mode we won't be able to clock out all the data. Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> m68k: Make sys_atomic_cmpxchg_32 work on classic m68k commit 9e2760d18b3cf179534bbc27692c84879c61b97c upstream. User space access must always go through uaccess accessors, since on classic m68k user space and kernel space are completely separate. Signed-off-by: Andreas Schwab <[email protected]> Tested-by: Thorsten Glaser <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> m68k: Correct the Atari ALLOWINT definition commit c663600584a596b5e66258cc10716fb781a5c2c9 upstream. Booting a 3.2, 3.3, or 3.4-rc4 kernel on an Atari using the `nfeth' ethernet device triggers a WARN_ONCE() in generic irq handling code on the first irq for that device: WARNING: at kernel/irq/handle.c:146 handle_irq_event_percpu+0x134/0x142() irq 3 handler nfeth_interrupt+0x0/0x194 enabled interrupts Modules linked in: Call Trace: [<000299b2>] warn_slowpath_common+0x48/0x6a [<000299c0>] warn_slowpath_common+0x56/0x6a [<00029a4c>] warn_slowpath_fmt+0x2a/0x32 [<0005b34c>] handle_irq_event_percpu+0x134/0x142 [<0005b34c>] handle_irq_event_percpu+0x134/0x142 [<0000a584>] nfeth_interrupt+0x0/0x194 [<001ba0a8>] schedule_preempt_disabled+0x0/0xc [<0005b37a>] handle_irq_event+0x20/0x2c [<0005add4>] generic_handle_irq+0x2c/0x3a [<00002ab6>] do_IRQ+0x20/0x32 [<0000289e>] auto_irqhandler_fixup+0x4/0x6 [<00003144>] cpu_idle+0x22/0x2e [<001b8a78>] printk+0x0/0x18 [<0024d112>] start_kernel+0x37a/0x386 [<0003021d>] __do_proc_dointvec+0xb1/0x366 [<0003021d>] __do_proc_dointvec+0xb1/0x366 [<0024c31e>] _sinittext+0x31e/0x9c0 After invoking the irq's handler the kernel sees !irqs_disabled() and concludes that the handler erroneously enabled interrupts. However, debugging shows that !irqs_disabled() is true even before the handler is invoked, which indicates a problem in the platform code rather than the specific driver. The warning does not occur in 3.1 or older kernels. It turns out that the ALLOWINT definition for Atari is incorrect. The Atari definition of ALLOWINT is ~0x400, the stated purpose of that is to avoid taking HSYNC interrupts. irqs_disabled() returns true if the 3-bit ipl & 4 is non-zero. The nfeth interrupt runs at ipl 3 (it's autovector 3), but 3 & 4 is zero so irqs_disabled() is false, and the warning above is generated. When interrupts are explicitly disabled, ipl is set to 7. When they are enabled, ipl is masked with ALLOWINT. On Atari this will result in ipl = 3, which blocks interrupts at ipl 3 and below. So how come nfeth interrupts at ipl 3 are received at all? That's because ipl is reset to 2 by Atari-specific code in default_idle(), again with the stated purpose of blocking HSYNC interrupts. This discrepancy means that ipl 3 can remain blocked for longer than intended. Both default_idle() and falcon_hblhandler() identify HSYNC with ipl 2, and the "Atari ST/.../F030 Hardware Register Listing" agrees, but ALLOWINT is defined as if HSYNC was ipl 3. [As an experiment I modified default_idle() to reset ipl to 3, and as expected that resulted in all nfeth interrupts being blocked.] The fix is simple: define ALLOWINT as ~0x500 instead. This makes arch_local_irq_enable() consistent with default_idle(), and prevents the !irqs_disabled() problems for ipl 3 interrupts. Tested on Atari running in an Aranym VM. Signed-off-by: Mikael Pettersson <[email protected]> Tested-by: Michael Schmitz <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> futex: Test for pi_mutex on fault in futex_wait_requeue_pi() commit b6070a8d9853eda010a549fa9a09eb8d7269b929 upstream. If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test for pi_mutex != NULL before testing the owner against current and possibly unlocking it. Signed-off-by: Darren Hart <[email protected]> Cc: Dave Jones <[email protected]> Cc: Dan Carpenter <[email protected]> Link: http://lkml.kernel.org/r/dc59890338fc413606f04e5c5b131530734dae3d.1342809673.git.dvhart@linux.intel.com Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> futex: Fix bug in WARN_ON for NULL q.pi_state commit f27071cb7fe3e1d37a9dbe6c0dfc5395cd40fa43 upstream. The WARN_ON in futex_wait_requeue_pi() for a NULL q.pi_state was testing the address (&q.pi_state) of the pointer instead of the value (q.pi_state) of the pointer. Correct it accordingly. Signed-off-by: Darren Hart <[email protected]> Cc: Dave Jones <[email protected]> Link: http://lkml.kernel.org/r/1c85d97f6e5f79ec389a4ead3e367363c74bd09a.1342809673.git.dvhart@linux.intel.com Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi() commit 6f7b0a2a5c0fb03be7c25bd1745baa50582348ef upstream. If uaddr == uaddr2, then we have broken the rule of only requeueing from a non-pi futex to a pi futex with this call. If we attempt this, as the trinity test suite manages to do, we miss early wakeups as q.key is equal to key2 (because they are the same uaddr). We will then attempt to dereference the pi_mutex (which would exist had the futex_q been properly requeued to a pi futex) and trigger a NULL pointer dereference. Signed-off-by: Darren Hart <[email protected]> Cc: Dave Jones <[email protected]> Link: http://lkml.kernel.org/r/ad82bfe7f7d130247fbe2b5b4275654807774227.1342809673.git.dvhart@linux.intel.com Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Linux 3.0.40 power: suspend: respect suspend_console_deferred value Conflicts: kernel/power/suspend.c

x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86' commit 141168c36cdee3ff23d9c7700b0edc47cb65479f and commit 3f806e50981825fa56a7f1938f24c0680816be45 upstream. Several fields in struct cpuinfo_x86 were not defined for the !SMP case, likely to save space. However, those fields still have some meaning for UP, and keeping them allows some #ifdef removal from other files. The additional size of the UP kernel from this change is not significant enough to worry about keeping up the distinction: text data bss dec hex filename 4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before 4737444 506459 972040 6215943 5ed907 vmlinux.o.after for a difference of 276 bytes for an example UP config. If someone wants those 276 bytes back badly then it should be implemented in a cleaner way. Signed-off-by: Kevin Winchester <[email protected]> Cc: Steffen Persvold <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts commit a119365586b0130dfea06457f584953e0ff6481d upstream. The following build error occured during a ia64 build with swap-over-NFS patches applied. net/core/sock.c:274:36: error: initializer element is not constant net/core/sock.c:274:36: error: (near initialization for 'memalloc_socks') net/core/sock.c:274:36: error: initializer element is not constant This is identical to a parisc build error. Fengguang Wu, Mel Gorman and James Bottomley did all the legwork to track the root cause of the problem. This fix and entire commit log is shamelessly copied from them with one extra detail to change a dubious runtime use of ATOMIC_INIT() to atomic_set() in drivers/char/mspec.c Dave Anglin says: > Here is the line in sock.i: > > struct static_key memalloc_socks = ((struct static_key) { .enabled = > ((atomic_t) { (0) }) }); The above line contains two compound literals. It also uses a designated initializer to initialize the field enabled. A compound literal is not a constant expression. The location of the above statement isn't fully clear, but if a compound literal occurs outside the body of a function, the initializer list must consist of constant expressions. Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SUNRPC: return negative value in case rpcbind client creation error commit caea33da898e4e14f0ba58173e3b7689981d2c0b upstream. Without this patch kernel will panic on LockD start, because lockd_up() checks lockd_up_net() result for negative value. From my pow it's better to return negative value from rpcbind routines instead of replacing all such checks like in lockd_up(). Signed-off-by: Stanislav Kinsbursky <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> nilfs2: fix deadlock issue between chcp and thaw ioctls commit 572d8b3945a31bee7c40d21556803e4807fd9141 upstream. An fs-thaw ioctl causes deadlock with a chcp or mkcp -s command: chcp D ffff88013870f3d0 0 1325 1324 0x00000004 ... Call Trace: nilfs_transaction_begin+0x11c/0x1a0 [nilfs2] wake_up_bit+0x20/0x20 copy_from_user+0x18/0x30 [nilfs2] nilfs_ioctl_change_cpmode+0x7d/0xcf [nilfs2] nilfs_ioctl+0x252/0x61a [nilfs2] do_page_fault+0x311/0x34c get_unmapped_area+0x132/0x14e do_vfs_ioctl+0x44b/0x490 __set_task_blocked+0x5a/0x61 vm_mmap_pgoff+0x76/0x87 __set_current_blocked+0x30/0x4a sys_ioctl+0x4b/0x6f system_call_fastpath+0x16/0x1b thaw D ffff88013870d890 0 1352 1351 0x00000004 ... Call Trace: rwsem_down_failed_common+0xdb/0x10f call_rwsem_down_write_failed+0x13/0x20 down_write+0x25/0x27 thaw_super+0x13/0x9e do_vfs_ioctl+0x1f5/0x490 vm_mmap_pgoff+0x76/0x87 sys_ioctl+0x4b/0x6f filp_close+0x64/0x6c system_call_fastpath+0x16/0x1b where the thaw ioctl deadlocked at thaw_super() when called while chcp was waiting at nilfs_transaction_begin() called from nilfs_ioctl_change_cpmode(). This deadlock is 100% reproducible. This is because nilfs_ioctl_change_cpmode() first locks sb->s_umount in read mode and then waits for unfreezing in nilfs_transaction_begin(), whereas thaw_super() locks sb->s_umount in write mode. The locking of sb->s_umount here was intended to make snapshot mounts and the downgrade of snapshots to checkpoints exclusive. This fixes the deadlock issue by replacing the sb->s_umount usage in nilfs_ioctl_change_cpmode() with a dedicated mutex which protects snapshot mounts. Signed-off-by: Ryusuke Konishi <[email protected]> Cc: Fernando Luis Vazquez Cao <[email protected]> Tested-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> pcdp: use early_ioremap/early_iounmap to access pcdp table commit 6c4088ac3a4d82779903433bcd5f048c58fb1aca upstream. efi_setup_pcdp_console() is called during boot to parse the HCDP/PCDP EFI system table and setup an early console for printk output. The routine uses ioremap/iounmap to setup access to the HCDP/PCDP table information. The call to ioremap is happening early in the boot process which leads to a panic on x86_64 systems: panic+0x01ca do_exit+0x043c oops_end+0x00a7 no_context+0x0119 __bad_area_nosemaphore+0x0138 bad_area_nosemaphore+0x000e do_page_fault+0x0321 page_fault+0x0020 reserve_memtype+0x02a1 __ioremap_caller+0x0123 ioremap_nocache+0x0012 efi_setup_pcdp_console+0x002b setup_arch+0x03a9 start_kernel+0x00d4 x86_64_start_reservations+0x012c x86_64_start_kernel+0x00fe This replaces the calls to ioremap/iounmap in efi_setup_pcdp_console() with calls to early_ioremap/early_iounmap which can be called during early boot. This patch was tested on an x86_64 prototype system which uses the HCDP/PCDP table for early console setup. Signed-off-by: Greg Pearson <[email protected]> Acked-by: Khalid Aziz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm: fix wrong argument of migrate_huge_pages() in soft_offline_huge_page() commit dc32f63453f56d07a1073a697dcd843dd3098c09 upstream. Commit a6bc32b89922 ("mm: compaction: introduce sync-light migration for use by compaction") changed the declaration of migrate_pages() and migrate_huge_pages(). But it missed changing the argument of migrate_huge_pages() in soft_offline_huge_page(). In this case, we should call migrate_huge_pages() with MIGRATE_SYNC. Additionally, there is a mismatch between type the of argument and the function declaration for migrate_pages(). Signed-off-by: Joonsoo Kim <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Mel Gorman <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: "Aneesh Kumar K.V" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: 7479/1: mm: avoid NULL dereference when flushing gate_vma with VIVT caches commit b74253f78400f9a4b42da84bb1de7540b88ce7c4 upstream. The vivt_flush_cache_{range,page} functions check that the mm_struct of the VMA being flushed has been active on the current CPU before performing the cache maintenance. The gate_vma has a NULL mm_struct pointer and, as such, will cause a kernel fault if we try to flush it with the above operations. This happens during ELF core dumps, which include the gate_vma as it may be useful for debugging purposes. This patch adds checks to the VIVT cache flushing functions so that VMAs with a NULL mm_struct are flushed unconditionally (the vectors page may be dirty if we use it to store the current TLS pointer). Reported-by: Gilles Chanteperdrix <[email protected]> Tested-by: Uros Bizjak <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: 7478/1: errata: extend workaround for erratum #720789 commit 5a783cbc48367cfc7b65afc75430953dfe60098f upstream. Commit cdf357f ("ARM: 6299/1: errata: TLBIASIDIS and TLBIMVAIS operations can broadcast a faulty ASID") replaced by-ASID TLB flushing operations with all-ASID variants to workaround A9 erratum #720789. This patch extends the workaround to include the tlb_range operations, which were overlooked by the original patch. Tested-by: Steve Capper <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: arch/arm/mm/tlb-v7.S mm: mmu_notifier: fix freed page still mapped in secondary MMU commit 3ad3d901bbcfb15a5e4690e55350db0899095a68 upstream. mmu_notifier_release() is called when the process is exiting. It will delete all the mmu notifiers. But at this time the page belonging to the process is still present in page tables and is present on the LRU list, so this race will happen: CPU 0 CPU 1 mmu_notifier_release: try_to_unmap: hlist_del_init_rcu(&mn->hlist); ptep_clear_flush_notify: mmu nofifler not found free page !!!!!! /* * At the point, the page has been * freed, but it is still mapped in * the secondary MMU. */ mn->ops->release(mn, mm); Then the box is not stable and sometimes we can get this bug: [ 738.075923] BUG: Bad page state in process migrate-perf pfn:03bec [ 738.075931] page:ffffea00000efb00 count:0 mapcount:0 mapping: (null) index:0x8076 [ 738.075936] page flags: 0x20000000000014(referenced|dirty) The same issue is present in mmu_notifier_unregister(). We can call ->release before deleting the notifier to ensure the page has been unmapped from the secondary MMU before it is freed. Signed-off-by: Xiao Guangrong <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Andrea Arcangeli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mac80211: cancel mesh path timer commit dd4c9260e7f23f2e951cbfb2726e468c6d30306c upstream. The mesh path timer needs to be canceled when leaving the mesh as otherwise it could fire after the interface has been removed already. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> x86, nops: Missing break resulting in incorrect selection on Intel commit d6250a3f12edb3a86db9598ffeca3de8b4a219e9 upstream. The Intel case falls through into the generic case which then changes the values. For cases like the P6 it doesn't do the right thing so this seems to be a screwup. Signed-off-by: Alan Cox <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: H. Peter Anvin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: Add support for architectural random hooks commit 63d77173266c1791f1553e9e8ccea65dc87c4485 upstream. Add support for architecture-specific hooks into the kernel-directed random number generator interfaces. This patchset does not use the architecture random number generator interfaces for the userspace-directed interfaces (/dev/random and /dev/urandom), thus eliminating the need to distinguish between them based on a pool pointer. Changes in version 3: - Moved the hooks from extract_entropy() to get_random_bytes(). - Changes the hooks to inlines. Signed-off-by: H. Peter Anvin <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Herbert Xu <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> fix typo/thinko in get_random_bytes() commit bd29e568a4cb6465f6e5ec7c1c1f3ae7d99cbec1 upstream. If there is an architecture-specific random number generator we use it to acquire randomness one "long" at a time. We should put these random words into consecutive words in the result buffer - not just overwrite the first word again and again. Signed-off-by: Tony Luck <[email protected]> Acked-by: H. Peter Anvin <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: Use arch_get_random_int instead of cycle counter if avail commit cf833d0b9937874b50ef2867c4e8badfd64948ce upstream. We still don't use rdrand in /dev/random, which just seems stupid. We accept the *cycle*counter* as a random input, but we don't accept rdrand? That's just broken. Sure, people can do things in user space (write to /dev/random, use rdrand in addition to /dev/random themselves etc etc), but that *still* seems to be a particularly stupid reason for saying "we shouldn't bother to try to do better in /dev/random". And even if somebody really doesn't trust rdrand as a source of random bytes, it seems singularly stupid to trust the cycle counter *more*. So I'd suggest the attached patch. I'm not going to even bother arguing that we should add more bits to the entropy estimate, because that's not the point - I don't care if /dev/random fills up slowly or not, I think it's just stupid to not use the bits we can get from rdrand and mix them into the strong randomness pool. Link: http://lkml.kernel.org/r/CA%2B55aFwn59N1=m651QAyTy-1gO1noGbK18zwKDwvwqnravA84A@mail.gmail.com Acked-by: "David S. Miller" <[email protected]> Acked-by: "Theodore Ts'o" <[email protected]> Acked-by: Herbert Xu <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Tony Luck <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: Use arch-specific RNG to initialize the entropy store commit 3e88bdff1c65145f7ba297ccec69c774afe4c785 upstream. If there is an architecture-specific random number generator (such as RDRAND for Intel architectures), use it to initialize /dev/random's entropy stores. Even in the worst case, if RDRAND is something like AES(NSA_KEY, counter++), it won't hurt, and it will definitely help against any other adversaries. Signed-off-by: "Theodore Ts'o" <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: Adjust the number of loops when initializing commit 2dac8e54f988ab58525505d7ef982493374433c3 upstream. When we are initializing using arch_get_random_long() we only need to loop enough times to touch all the bytes in the buffer; using poolwords for that does twice the number of operations necessary on a 64-bit machine, since in the random number generator code "word" means 32 bits. Signed-off-by: H. Peter Anvin <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> drivers/char/random.c: fix boot id uniqueness race commit 44e4360fa3384850d65dd36fb4e6e5f2f112709b upstream. /proc/sys/kernel/random/boot_id can be read concurrently by userspace processes. If two (or more) user-space processes concurrently read boot_id when sysctl_bootid is not yet assigned, a race can occur making boot_id differ between the reads. Because the whole point of the boot id is to be unique across a kernel execution, fix this by protecting this operation with a spinlock. Given that this operation is not frequently used, hitting the spinlock on each call should not be an issue. Signed-off-by: Mathieu Desnoyers <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Matt Mackall <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: make 'add_interrupt_randomness()' do something sane commit 775f4b297b780601e61787b766f306ed3e1d23eb upstream. We've been moving away from add_interrupt_randomness() for various reasons: it's too expensive to do on every interrupt, and flooding the CPU with interrupts could theoretically cause bogus floods of entropy from a somewhat externally controllable source. This solves both problems by limiting the actual randomness addition to just once a second or after 64 interrupts, whicever comes first. During that time, the interrupt cycle data is buffered up in a per-cpu pool. Also, we make sure the the nonblocking pool used by urandom is initialized before we start feeding the normal input pool. This assures that /dev/urandom is returning unpredictable data as soon as possible. (Based on an original patch by Linus, but significantly modified by tytso.) Tested-by: Eric Wustrow <[email protected]> Reported-by: Eric Wustrow <[email protected]> Reported-by: Nadia Heninger <[email protected]> Reported-by: Zakir Durumeric <[email protected]> Reported-by: J. Alex Halderman <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: use lockless techniques in the interrupt path commit 902c098a3663de3fa18639efbb71b6080f0bcd3c upstream. The real-time Linux folks don't like add_interrupt_randomness() taking a spinlock since it is called in the low-level interrupt routine. This also allows us to reduce the overhead in the fast path, for the random driver, which is the interrupt collection path. Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: create add_device_randomness() interface commit a2080a67abe9e314f9e9c2cc3a4a176e8a8f8793 upstream. Add a new interface, add_device_randomness() for adding data to the random pool that is likely to differ between two devices (or possibly even per boot). This would be things like MAC addresses or serial numbers, or the read-out of the RTC. This does *not* add any actual entropy to the pool, but it initializes the pool to different values for devices that might otherwise be identical and have very little entropy available to them (particularly common in the embedded world). [ Modified by tytso to mix in a timestamp, since there may be some variability caused by the time needed to detect/configure the hardware in question. ] Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> usb: feed USB device information to the /dev/random driver commit b04b3156a20d395a7faa8eed98698d1e17a36000 upstream. Send the USB device's serial, product, and manufacturer strings to the /dev/random driver to help seed its pools. Cc: Linus Torvalds <[email protected]> Acked-by: Greg KH <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: drivers/usb/core/hub.c net: feed /dev/random with the MAC address when registering a device commit 7bf2357524408b97fec58344caf7397f8140c3fd upstream. Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: David Miller <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: use the arch-specific rng in xfer_secondary_pool commit e6d4947b12e8ad947add1032dd754803c6004824 upstream. If the CPU supports a hardware random number generator, use it in xfer_secondary_pool(), where it will significantly improve things and where we can afford it. Also, remove the use of the arch-specific rng in add_timer_randomness(), since the call is significantly slower than get_cycles(), and we're much better off using it in xfer_secondary_pool() anyway. Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: add new get_random_bytes_arch() function commit c2557a303ab6712bb6e09447df828c557c710ac9 upstream. Create a new function, get_random_bytes_arch() which will use the architecture-specific hardware random number generator if it is present. Change get_random_bytes() to not use the HW RNG, even if it is avaiable. The reason for this is that the hw random number generator is fast (if it is present), but it requires that we trust the hardware manufacturer to have not put in a back door. (For example, an increasing counter encrypted by an AES key known to the NSA.) It's unlikely that Intel (for example) was paid off by the US Government to do this, but it's impossible for them to prove otherwise --- especially since Bull Mountain is documented to use AES as a whitener. Hence, the output of an evil, trojan-horse version of RDRAND is statistically indistinguishable from an RDRAND implemented to the specifications claimed by Intel. Short of using a tunnelling electronic microscope to reverse engineer an Ivy Bridge chip and disassembling and analyzing the CPU microcode, there's no way for us to tell for sure. Since users of get_random_bytes() in the Linux kernel need to be able to support hardware systems where the HW RNG is not present, most time-sensitive users of this interface have already created their own cryptographic RNG interface which uses get_random_bytes() as a seed. So it's much better to use the HW RNG to improve the existing random number generator, by mixing in any entropy returned by the HW RNG into /dev/random's entropy pool, but to always _use_ /dev/random's entropy pool. This way we get almost of the benefits of the HW RNG without any potential liabilities. The only benefits we forgo is the speed/performance enhancements --- and generic kernel code can't depend on depend on get_random_bytes() having the speed of a HW RNG anyway. For those places that really want access to the arch-specific HW RNG, if it is available, we provide get_random_bytes_arch(). Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: add tracepoints for easier debugging and verification commit 00ce1db1a634746040ace24c09a4e3a7949a3145 upstream. Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> MAINTAINERS: Theodore Ts'o is taking over the random driver commit 330e0a01d54c2b8606c56816f99af6ebc58ec92c upstream. Matt Mackall stepped down as the /dev/random driver maintainer last year, so Theodore Ts'o is taking back the /dev/random driver. Cc: Matt Mackall <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> rtc: wm831x: Feed the write counter into device_add_randomness() commit 9dccf55f4cb011a7552a8a2749a580662f5ed8ed upstream. The tamper evident features of the RTC include the "write counter" which is a pseudo-random number regenerated whenever we set the RTC. Since this value is unpredictable it should provide some useful seeding to the random number generator. Only do this on boot since the goal is to seed the pool rather than add useful entropy. Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mfd: wm831x: Feed the device UUID into device_add_randomness() commit 27130f0cc3ab97560384da437e4621fc4e94f21c upstream. wm831x devices contain a unique ID value. Feed this into the newly added device_add_randomness() to add some per device seed data to the pool. Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: remove rand_initialize_irq() commit c5857ccf293968348e5eb4ebedc68074de3dcda6 upstream. With the new interrupt sampling system, we are no longer using the timer_rand_state structure in the irq descriptor, so we can stop initializing it now. [ Merged in fixes from Sedat to find some last missing references to rand_initialize_irq() ] Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Sedat Dilek <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: kernel/irq/manage.c random: Add comment to random_initialize() commit cbc96b7594b5691d61eba2db8b2ea723645be9ca upstream. Many platforms have per-machine instance data (serial numbers, asset tags, etc.) squirreled away in areas that are accessed during early system bringup. Mixing this data into the random pools has a very high value in providing better random data, so we should allow (and even encourage) architecture code to call add_device_randomness() from the setup_arch() paths. However, this limits our options for internal structure of the random driver since random_initialize() is not called until long after setup_arch(). Add a big fat comment to rand_initialize() spelling out this requirement. Suggested-by: Theodore Ts'o <[email protected]> Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> dmi: Feed DMI table to /dev/random driver commit d114a33387472555188f142ed8e98acdb8181c6d upstream. Send the entire DMI (SMBIOS) table to the /dev/random driver to help seed its pools. Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> random: mix in architectural randomness in extract_buf() commit d2e7c96af1e54b507ae2a6a7dd2baf588417a7e5 upstream. Mix in any architectural randomness in extract_buf() instead of xfer_secondary_buf(). This allows us to mix in more architectural randomness, and it also makes xfer_secondary_buf() faster, moving a tiny bit of additional CPU overhead to process which is extracting the randomness. [ Commit description modified by tytso to remove an extended advertisement for the RDRAND instruction. ] Signed-off-by: H. Peter Anvin <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: DJ Johnston <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> x86, microcode: microcode_core.c simple_strtoul cleanup commit e826abd523913f63eb03b59746ffb16153c53dc4 upstream. Change reload_for_cpu() in kernel/microcode_core.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/1336324264.2897.9.camel@lorien2 Signed-off-by: H. Peter Anvin <[email protected]> Cc: Henrique de Moraes Holschuh <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> x86, microcode: Sanitize per-cpu microcode reloading interface commit c9fc3f778a6a215ace14ee556067c73982b6d40f upstream. Microcode reloading in a per-core manner is a very bad idea for both major x86 vendors. And the thing is, we have such interface with which we can end up with different microcode versions applied on different cores of an otherwise homogeneous wrt (family,model,stepping) system. So turn off the possibility of doing that per core and allow it only system-wide. This is a minimal fix which we'd like to see in stable too thus the more-or-less arbitrary decision to allow system-wide reloading only on the BSP: $ echo 1 > /sys/devices/system/cpu/cpu0/microcode/reload ... and disable the interface on the other cores: $ echo 1 > /sys/devices/system/cpu/cpu23/microcode/reload -bash: echo: write error: Invalid argument Also, allowing the reload only from one CPU (the BSP in that case) doesn't allow the reload procedure to degenerate into an O(n^2) deal when triggering reloads from all /sys/devices/system/cpu/cpuX/microcode/reload sysfs nodes simultaneously. A more generic fix will follow. Signed-off-by: Borislav Petkov <[email protected]> Cc: Henrique de Moraes Holschuh <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables commit d833352a4338dc31295ed832a30c9ccff5c7a183 upstream. If a process creates a large hugetlbfs mapping that is eligible for page table sharing and forks heavily with children some of whom fault and others which destroy the mapping then it is possible for page tables to get corrupted. Some teardowns of the mapping encounter a "bad pmd" and output a message to the kernel log. The final teardown will trigger a BUG_ON in mm/filemap.c. This was reproduced in 3.4 but is known to have existed for a long time and goes back at least as far as 2.6.37. It was probably was introduced in 2.6.20 by [39dde65: shared page table for hugetlb page]. The messages look like this; [ ..........] Lots of bad pmd messages followed by this [ 127.164256] mm/memory.c:391: bad pmd ffff880412e04fe8(80000003de4000e7). [ 127.164257] mm/memory.c:391: bad pmd ffff880412e04ff0(80000003de6000e7). [ 127.164258] mm/memory.c:391: bad pmd ffff880412e04ff8(80000003de0000e7). [ 127.186778] ------------[ cut here ]------------ [ 127.186781] kernel BUG at mm/filemap.c:134! [ 127.186782] invalid opcode: 0000 [Evervolv#1] SMP [ 127.186783] CPU 7 [ 127.186784] Modules linked in: af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf ext3 jbd dm_mod coretemp crc32c_intel usb_storage ghash_clmulni_intel aesni_intel i2c_i801 r8169 mii uas sr_mod cdrom sg iTCO_wdt iTCO_vendor_support shpchp serio_raw cryptd aes_x86_64 e1000e pci_hotplug dcdbas aes_generic container microcode ext4 mbcache jbd2 crc16 sd_mod crc_t10dif i915 drm_kms_helper drm i2c_algo_bit ehci_hcd ahci libahci usbcore rtc_cmos usb_common button i2c_core intel_agp video intel_gtt fan processor thermal thermal_sys hwmon ata_generic pata_atiixp libata scsi_mod [ 127.186801] [ 127.186802] Pid: 9017, comm: hugetlbfs-test Not tainted 3.4.0-autobuild #53 Dell Inc. OptiPlex 990/06D7TR [ 127.186804] RIP: 0010:[<ffffffff810ed6ce>] [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160 [ 127.186809] RSP: 0000:ffff8804144b5c08 EFLAGS: 00010002 [ 127.186810] RAX: 0000000000000001 RBX: ffffea000a5c9000 RCX: 00000000ffffffc0 [ 127.186811] RDX: 0000000000000000 RSI: 0000000000000009 RDI: ffff88042dfdad00 [ 127.186812] RBP: ffff8804144b5c18 R08: 0000000000000009 R09: 0000000000000003 [ 127.186813] R10: 0000000000000000 R11: 000000000000002d R12: ffff880412ff83d8 [ 127.186814] R13: ffff880412ff83d8 R14: 0000000000000000 R15: ffff880412ff83d8 [ 127.186815] FS: 00007fe18ed2c700(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000 [ 127.186816] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 127.186817] CR2: 00007fe340000503 CR3: 0000000417a14000 CR4: 00000000000407e0 [ 127.186818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 127.186819] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 127.186820] Process hugetlbfs-test (pid: 9017, threadinfo ffff8804144b4000, task ffff880417f803c0) [ 127.186821] Stack: [ 127.186822] ffffea000a5c9000 0000000000000000 ffff8804144b5c48 ffffffff810ed83b [ 127.186824] ffff8804144b5c48 000000000000138a 0000000000001387 ffff8804144b5c98 [ 127.186825] ffff8804144b5d48 ffffffff811bc925 ffff8804144b5cb8 0000000000000000 [ 127.186827] Call Trace: [ 127.186829] [<ffffffff810ed83b>] delete_from_page_cache+0x3b/0x80 [ 127.186832] [<ffffffff811bc925>] truncate_hugepages+0x115/0x220 [ 127.186834] [<ffffffff811bca43>] hugetlbfs_evict_inode+0x13/0x30 [ 127.186837] [<ffffffff811655c7>] evict+0xa7/0x1b0 [ 127.186839] [<ffffffff811657a3>] iput_final+0xd3/0x1f0 [ 127.186840] [<ffffffff811658f9>] iput+0x39/0x50 [ 127.186842] [<ffffffff81162708>] d_kill+0xf8/0x130 [ 127.186843] [<ffffffff81162812>] dput+0xd2/0x1a0 [ 127.186845] [<ffffffff8114e2d0>] __fput+0x170/0x230 [ 127.186848] [<ffffffff81236e0e>] ? rb_erase+0xce/0x150 [ 127.186849] [<ffffffff8114e3ad>] fput+0x1d/0x30 [ 127.186851] [<ffffffff81117db7>] remove_vma+0x37/0x80 [ 127.186853] [<ffffffff81119182>] do_munmap+0x2d2/0x360 [ 127.186855] [<ffffffff811cc639>] sys_shmdt+0xc9/0x170 [ 127.186857] [<ffffffff81410a39>] system_call_fastpath+0x16/0x1b [ 127.186858] Code: 0f 1f 44 00 00 48 8b 43 08 48 8b 00 48 8b 40 28 8b b0 40 03 00 00 85 f6 0f 88 df fe ff ff 48 89 df e8 e7 cb 05 00 e9 d2 fe ff ff <0f> 0b 55 83 e2 fd 48 89 e5 48 83 ec 30 48 89 5d d8 4c 89 65 e0 [ 127.186868] RIP [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160 [ 127.186870] RSP <ffff8804144b5c08> [ 127.186871] ---[ end trace 7cbac5d1db69f426 ]--- The bug is a race and not always easy to reproduce. To reproduce it I was doing the following on a single socket I7-based machine with 16G of RAM. $ hugeadm --pool-pages-max DEFAULT:13G $ echo $((18*1048576*1024)) > /proc/sys/kernel/shmmax $ echo $((18*1048576*1024)) > /proc/sys/kernel/shmall $ for i in `seq 1 9000`; do ./hugetlbfs-test; done On my particular machine, it usually triggers within 10 minutes but enabling debug options can change the timing such that it never hits. Once the bug is triggered, the machine is in trouble and needs to be rebooted. The machine will respond but processes accessing proc like "ps aux" will hang due to the BUG_ON. shutdown will also hang and needs a hard reset or a sysrq-b. The basic problem is a race between page table sharing and teardown. For the most part page table sharing depends on i_mmap_mutex. In some cases, it is also taking the mm->page_table_lock for the PTE updates but with shared page tables, it is the i_mmap_mutex that is more important. Unfortunately it appears to be also insufficient. Consider the following situation Process A Process B --------- --------- hugetlb_fault shmdt LockWrite(mmap_sem) do_munmap unmap_region unmap_vmas unmap_single_vma unmap_hugepage_range Lock(i_mmap_mutex) Lock(mm->page_table_lock) huge_pmd_unshare/unmap tables <--- (1) Unlock(mm->page_table_lock) Unlock(i_mmap_mutex) huge_pte_alloc ... Lock(i_mmap_mutex) ... vma_prio_walk, find svma, spte ... Lock(mm->page_table_lock) ... share spte ... Unlock(mm->page_table_lock) ... Unlock(i_mmap_mutex) ... hugetlb_no_page <--- (2) free_pgtables unlink_file_vma hugetlb_free_pgd_range remove_vma_list In this scenario, it is possible for Process A to share page tables with Process B that is trying to tear them down. The i_mmap_mutex on its own does not prevent Process A walking Process B's page tables. At (1) above, the page tables are not shared yet so it unmaps the PMDs. Process A sets up page table sharing and at (2) faults a new entry. Process B then trips up on it in free_pgtables. This patch fixes the problem by adding a new function __unmap_hugepage_range_final that is only called when the VMA is about to be destroyed. This function clears VM_MAYSHARE during unmap_hugepage_range() under the i_mmap_mutex. This makes the VMA ineligible for sharing and avoids the race. Superficially this looks like it would then be vunerable to truncate and madvise issues but hugetlbfs has its own truncate handlers so does not use unmap_mapping_range() and does not support madvise(DONTNEED). This should be treated as a -stable candidate if it is merged. Test program is as follows. The test case was mostly written by Michal Hocko with a few minor changes to reproduce this bug. ==== CUT HERE ==== static size_t huge_page_size = (2UL << 20); static size_t nr_huge_page_A = 512; static size_t nr_huge_page_B = 5632; unsigned int get_random(unsigned int max) { struct timeval tv; gettimeofday(&tv, NULL); srandom(tv.tv_usec); return random() % max; } static void play(void *addr, size_t size) { unsigned char *start = addr, *end = start + size, *a; start += get_random(size/2); /* we could itterate on huge pages but let's give it more time. */ for (a = start; a < end; a += 4096) *a = 0; } int main(int argc, char **argv) { key_t key = IPC_PRIVATE; size_t sizeA = nr_huge_page_A * huge_page_size; size_t sizeB = nr_huge_page_B * huge_page_size; int shmidA, shmidB; void *addrA = NULL, *addrB = NULL; int nr_children = 300, n = 0; if ((shmidA = shmget(key, sizeA, IPC_CREAT|SHM_HUGETLB|0660)) == -1) { perror("shmget:"); return 1; } if ((addrA = shmat(shmidA, addrA, SHM_R|SHM_W)) == (void *)-1UL) { perror("shmat"); return 1; } if ((shmidB = shmget(key, sizeB, IPC_CREAT|SHM_HUGETLB|0660)) == -1) { perror("shmget:"); return 1; } if ((addrB = shmat(shmidB, addrB, SHM_R|SHM_W)) == (void *)-1UL) { perror("shmat"); return 1; } fork_child: switch(fork()) { case 0: switch (n%3) { case 0: play(addrA, sizeA); break; case 1: play(addrB, sizeB); break; case 2: break; } break; case -1: perror("fork:"); break; default: if (++n < nr_children) goto fork_child; play(addrA, sizeA); break; } shmdt(addrA); shmdt(addrB); do { wait(NULL); } while (--n > 0); shmctl(shmidA, IPC_RMID, NULL); shmctl(shmidB, IPC_RMID, NULL); return 0; } [[email protected]: name the declaration's args, fix CONFIG_HUGETLBFS=n build] Signed-off-by: Hugh Dickins <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: mxs: Remove MMAP_MIN_ADDR setting from mxs_defconfig commit 3bed491c8d28329e34f8a31e3fe64d03f3a350f1 upstream. The CONFIG_DEFAULT_MMAP_MIN_ADDR was set to 65536 in mxs_defconfig, this caused severe breakage of userland applications since the upper limit for ARM is 32768. By default CONFIG_DEFAULT_MMAP_MIN_ADDR is set to 4096 and can also be changed via /proc/sys/vm/mmap_min_addr if needed. Quoting Russell King [1]: "4096 is also fine for ARM too. There's not much point in having defconfigs change it - that would just be pure noise in the config files." the CONFIG_DEFAULT_MMAP_MIN_ADDR can be removed from the defconfig altogether. This problem was introduced by commit cde7c41 (ARM: configs: add defconfig for mach-mxs). [1] http://marc.info/?l=linux-arm-kernel&m=134401593807820&w=2 Signed-off-by: Marek Vasut <[email protected]> Cc: Russell King <[email protected]> Cc: Wolfgang Denk <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: pxa: remove irq_to_gpio from ezx-pcap driver commit 59ee93a528b94ef4e81a08db252b0326feff171f upstream. The irq_to_gpio function was removed from the pxa platform in linux-3.2, and this driver has been broken since. There is actually no in-tree user of this driver that adds this platform device, but the driver can and does get enabled on some platforms. Without this patch, building ezx_defconfig results in: drivers/mfd/ezx-pcap.c: In function 'pcap_isr_work': drivers/mfd/ezx-pcap.c:205:2: error: implicit declaration of function 'irq_to_gpio' [-Werror=implicit-function-declaration] Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Haojian Zhuang <[email protected]> Cc: Samuel Ortiz <[email protected]> Cc: Daniel Ribeiro <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> cfg80211: process pending events when unregistering net device commit 1f6fc43e621167492ed4b7f3b4269c584c3d6ccc upstream. libertas currently calls cfg80211_disconnected() when it is being brought down. This causes an event to be allocated, but since the wdev is already removed from the rdev by the time that the event processing work executes, the event is never processed or freed. http://article.gmane.org/gmane.linux.kernel.wireless.general/95666 Fix this leak, and other possible situations, by processing the event queue when a device is being unregistered. Thanks to Johannes Berg for the suggestion. Signed-off-by: Daniel Drake <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> cfg80211: fix interface combinations check for ADHOC(IBSS) partial of commit 8e8b41f9d8c8e63fc92f899ace8da91a490ac573 upstream. As part of commit 463454b5dbd8 ("cfg80211: fix interface combinations check"), this extra check was introduced: if ((all_iftypes & used_iftypes) != used_iftypes) goto cont; However, most wireless NIC drivers did not advertise ADHOC in wiphy.iface_combinations[i].limits[] and hence we'll get -EBUSY when we bring up a ADHOC wlan with commands similar to: # iwconfig wlan0 mode ad-hoc && ifconfig wlan0 up In commit 8e8b41f9d8c8e ("cfg80211: enforce lack of interface combinations"), the change below fixes the issue: if (total == 1) return 0; But it also introduces other dependencies for stable. For example, a full cherry pick of 8e8b41f9d8c8e would introduce additional regressions unless we also start cherry picking driver specific fixes like the following: 9b4760e ath5k: add possible wiphy interface combinations 1ae2fc2 mac80211_hwsim: advertise interface combinations 20c8e8d ath9k: add possible wiphy interface combinations And the purpose of the 'if (total == 1)' is to cover the specific use case (IBSS, adhoc) that was mentioned above. So we just pick the specific part out from 8e8b41f9d8c8e here. Doing so gives stable kernels a way to fix the change introduced by 463454b5dbd8, without having to make cherry picks specific to various NIC drivers. Signed-off-by: Liang Li <[email protected]> Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> e1000e: NIC goes up and immediately goes down commit b7ec70be01a87f2c85df3ae11046e74f9b67e323 upstream. Found that commit d478eb4 was a bad commit. If the link partner is transmitting codeword (even if NULL codeword), then the RXCW.C bit will be set so check for RXCW.CW is unnecessary. Ref: RH BZ 840642 Reported-by: Fabio Futigami <[email protected]> Signed-off-by: Tushar Dave <[email protected]> CC: Marcelo Ricardo Leitner <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Peter P Waskiewicz Jr <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Input: wacom - Bamboo One 1024 pressure fix commit 6dc463511d4a690f01a9248df3b384db717e0b1c upstream. Bamboo One's with ID of 0x6a and 0x6b were added with correct indication of 1024 pressure levels but the Graphire packet routine was only looking at 9 bits. Increased to 10 bits. This bug caused these devices to roll over to zero pressure at half way mark. The other devices using this routine only support 256 or 512 range and look to fix unused bits at zero. Signed-off-by: Chris Bagwell <[email protected]> Reported-by: Tushant Mirchandani <[email protected]> Reviewed-by: Ping Cheng <[email protected]> Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> rt61pci: fix NULL pointer dereference in config_lna_gain commit deee0214def5d8a32b8112f11d9c2b1696e9c0cb upstream. We can not pass NULL libconf->conf->channel to rt61pci_config() as it is dereferenced unconditionally in rt61pci_config_lna_gain() subroutine. Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=44361 Reported-and-tested-by: <[email protected]> Signed-off-by: Stanislaw Gruszka <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Linux 3.0.41 arm: #endif derp

USB: vt6656: remove __devinit* from the struct usb_device_id table commit 4d088876f24887cd15a29db923f5f37db6a99f21 upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Forest Bond <[email protected]> CC: Marcos Paulo de Souza <[email protected]> CC: "David S. Miller" <[email protected]> CC: Jesper Juhl <[email protected]> CC: Jiri Pirko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: emi62: remove __devinit* from the struct usb_device_id table commit 83957df21dd94655d2b026e0944a69ff37b83988 upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Paul Gortmaker <[email protected]> CC: Andrew Morton <[email protected]> CC: Felipe Balbi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: hda - fix Copyright debug message commit 088c820b732dbfd515fc66d459d5f5777f79b406 upstream. As spec said, 1 indicates no copyright is asserted. Signed-off-by: Wang Xingchao <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: 7487/1: mm: avoid setting nG bit for user mappings that aren't present commit 47f1204329237a0f8655f5a9f14a38ac81946ca1 upstream. Swap entries are encoding in ptes such that !pte_present(pte) and pte_file(pte). The remaining bits of the descriptor are used to identify the swapfile and offset within it to the swap entry. When writing such a pte for a user virtual address, set_pte_at unconditionally sets the nG bit, which (in the case of LPAE) will corrupt the swapfile offset and lead to a BUG: [ 140.494067] swap_free: Unused swap offset entry 000763b4 [ 140.509989] BUG: Bad page map in process rs:main Q:Reg pte:0ec76800 pmd:8f92e003 This patch fixes the problem by only setting the nG bit for user mappings that are actually present. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: 7488/1: mm: use 5 bits for swapfile type encoding commit f5f2025ef3e2cdb593707cbf87378761f17befbe upstream. Page migration encodes the pfn in the offset field of a swp_entry_t. For LPAE, we support physical addresses of up to 36 bits (due to sparsemem limitations with the size of page flags), requiring 24 bits to represent a pfn. A further 3 bits are used to encode a swp_entry into a pte, leaving 5 bits for the type field. Furthermore, the core code defines MAX_SWAPFILES_SHIFT as 5, so the additional type bit does not get used. This patch reduces the width of the type field to 5 bits, allowing us to create up to 31 swapfiles of 64GB each. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: 7489/1: errata: fix workaround for erratum #720789 on UP systems commit 730a8128cd8978467eb1cf546b11014acb57d433 upstream. Commit 5a783cbc4836 ("ARM: 7478/1: errata: extend workaround for erratum #720789") added workarounds for erratum #720789 to the range TLB invalidation functions with the observation that the erratum only affects SMP platforms. However, when running an SMP_ON_UP kernel on a uniprocessor platform we must take care to preserve the ASID as the workaround is not required. This patch ensures that we don't set the ASID to 0 when flushing the TLB on such a system, preserving the original behaviour with the workaround disabled. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: arch/arm/mm/tlb-v7.S ARM: S3C24XX: Fix s3c2410_dma_enqueue parameters commit b01858c7806e7e6f6121da2e51c9222fc4d21dc6 upstream. Commit d670ac0 (ARM: SAMSUNG: DMA Cleanup as per sparse) changed the prototype of the s3c2410_dma_* functions to use the enum dma_ch instead of an generic unsigned int. In the s3c24xx dma.c s3c2410_dma_enqueue seems to have been forgotten, the other functions there were changed correctly. Signed-off-by: Heiko Stuebner <[email protected]> Signed-off-by: Kukjin Kim <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: imx: select CPU_FREQ_TABLE when needed commit f637c4c9405e21f44cf0045eaf77eddd3a79ca5a upstream. The i.MX cpufreq implementation uses the CPU_FREQ_TABLE helpers, so it needs to select that code to be built. This problem has apparently existed since the i.MX cpufreq code was first merged in v2.6.37. Building IMX without CPU_FREQ_TABLE results in: arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_exit': arch/arm/plat-mxc/cpufreq.c:173: undefined reference to `cpufreq_frequency_table_put_attr' arch/arm/plat-mxc/built-in.o: In function `mxc_set_target': arch/arm/plat-mxc/cpufreq.c:84: undefined reference to `cpufreq_frequency_table_target' arch/arm/plat-mxc/built-in.o: In function `mxc_verify_speed': arch/arm/plat-mxc/cpufreq.c:65: undefined reference to `cpufreq_frequency_table_verify' arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_init': arch/arm/plat-mxc/cpufreq.c:154: undefined reference to `cpufreq_frequency_table_cpuinfo' arch/arm/plat-mxc/cpufreq.c:162: undefined reference to `cpufreq_frequency_table_get_attr' Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Shawn Guo <[email protected]> Cc: Sascha Hauer <[email protected]> Cc: Yong Shen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ASoC: wm9712: Fix microphone source selection commit ccf795847a38235ee4a56a24129ce75147d6ba8f upstream. Currently the microphone input source is not selectable as while there is a DAPM widget it's not connected to anything so it won't be properly instantiated. Add something more correct for the input structure to get things going, even though it's not hooked into the rest of the routing map and so won't actually achieve anything except allowing the relevant register bits to be written. Reported-by: Christop Fritz <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> vfs: missed source of ->f_pos races commit 0e665d5d1125f9f4ccff56a75e814f10f88861a2 upstream. compat_sys_{read,write}v() need the same "pass a copy of file->f_pos" thing as sys_{read,write}{,v}(). Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> vfs: canonicalize create mode in build_open_flags() commit e68726ff72cf7ba5e7d789857fcd9a75ca573f03 upstream. Userspace can pass weird create mode in open(2) that we canonicalize to "(mode & S_IALLUGO) | S_IFREG" in vfs_create(). The problem is that we use the uncanonicalized mode before calling vfs_create() with unforseen consequences. So do the canonicalization early in build_open_flags(). Signed-off-by: Miklos Szeredi <[email protected]> Tested-by: Richard W.M. Jones <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> alpha: Don't export SOCK_NONBLOCK to user space. commit a2fa3ccd7b43665fe14cb562761a6c3d26a1d13f upstream. Currently we export SOCK_NONBLOCK to user space but that conflicts with the definition from glibc leading to compilation errors in user programs (e.g. see Debian bug #658460). The generic socket.h restricts the definition of SOCK_NONBLOCK to the kernel, as does the MIPS specific socket.h, so let's do the same on Alpha. Signed-off-by: Michael Cree <[email protected]> Acked-by: Matt Turner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: winbond: remove __devinit* from the struct usb_device_id table commit 43a34695d9cd79c6659f09da6d3b0624f3dd169f upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Pavel Machek <[email protected]> CC: Paul Gortmaker <[email protected]> CC: "John W. Linville" <[email protected]> CC: Eliad Peller <[email protected]> CC: Devendra Naga <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm: hugetlbfs: correctly populate shared pmd commit eb48c071464757414538c68a6033c8f8c15196f8 upstream. Each page mapped in a process's address space must be correctly accounted for in _mapcount. Normally the rules for this are straightforward but hugetlbfs page table sharing is different. The page table pages at the PMD level are reference counted while the mapcount remains the same. If this accounting is wrong, it causes bugs like this one reported by Larry Woodman: kernel BUG at mm/filemap.c:135! invalid opcode: 0000 [Evervolv#1] SMP CPU 22 Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi] Pid: 18001, comm: mpitest Tainted: G W 3.3.0+ Evervolv#4 Dell Inc. PowerEdge R620/07NDJ2 RIP: 0010:[<ffffffff8112cfed>] [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170 Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20) Call Trace: delete_from_page_cache+0x40/0x80 truncate_hugepages+0x115/0x1f0 hugetlbfs_evict_inode+0x18/0x30 evict+0x9f/0x1b0 iput_final+0xe3/0x1e0 iput+0x3e/0x50 d_kill+0xf8/0x110 dput+0xe2/0x1b0 __fput+0x162/0x240 During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc() shared page tables with the check dst_pte == src_pte. The logic is if the PMD page is the same, they must be shared. This assumes that the sharing is between the parent and child. However, if the sharing is with a different process entirely then this check fails as in this diagram: parent | ------------>pmd src_pte----------> data page ^ other--------->pmd--------------------| ^ child-----------| dst_pte For this situation to occur, it must be possible for Parent and Other to have faulted and failed to share page tables with each other. This is possible due to the following style of race. PROC A PROC B copy_hugetlb_page_range copy_hugetlb_page_range src_pte == huge_pte_offset src_pte == huge_pte_offset !src_pte so no sharing !src_pte so no sharing (time passes) hugetlb_fault hugetlb_fault huge_pte_alloc huge_pte_alloc huge_pmd_share huge_pmd_share LOCK(i_mmap_mutex) find nothing, no sharing UNLOCK(i_mmap_mutex) LOCK(i_mmap_mutex) find nothing, no sharing UNLOCK(i_mmap_mutex) pmd_alloc pmd_alloc LOCK(instantiation_mutex) fault UNLOCK(instantiation_mutex) LOCK(instantiation_mutex) fault UNLOCK(instantiation_mutex) These two processes are not poing to the same data page but are not sharing page tables because the opportunity was missed. When either process later forks, the src_pte == dst pte is potentially insufficient. As the check falls through, the wrong PTE information is copied in (harmless but wrong) and the mapcount is bumped for a page mapped by a shared page table leading to the BUG_ON. This patch addresses the issue by moving pmd_alloc into huge_pmd_share which guarantees that the shared pud is populated in the same critical section as pmd. This also means that huge_pte_offset test in huge_pmd_share is serialized correctly now which in turn means that the success of the sharing will be higher as the racing tasks see the pud and pmd populated together. Race identified and changelog written mostly by Mel Gorman. {[email protected]: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style] Reported-by: Larry Woodman <[email protected]> Tested-by: Larry Woodman <[email protected]> Reviewed-by: Mel Gorman <[email protected]> Signed-off-by: Michal Hocko <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: David Gibson <[email protected]> Cc: Ken Chen <[email protected]> Cc: Cong Wang <[email protected]> Cc: Hillf Danton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> NFSv3: Ensure that do_proc_get_root() reports errors correctly commit 086600430493e04b802bee6e5b3ce0458e4eb77f upstream. If the rpc call to NFS3PROC_FSINFO fails, then we need to report that error so that the mount fails. Otherwise we can end up with a superblock with completely unusable values for block sizes, maxfilesize, etc. Reported-by: Yuanming Chen <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done commit 47fbf7976e0b7d9dcdd799e2a1baba19064d9631 upstream. Ever since commit 0a57cdac3f (NFSv4.1 send layoutreturn to fence disconnected data server) we've been sending layoutreturn calls while there is potentially still outstanding I/O to the data servers. The reason we do this is to avoid races between replayed writes to the MDS and the original writes to the DS. When this happens, the BUG_ON() in nfs4_layoutreturn_done can be triggered because it assumes that we would never call layoutreturn without knowing that all I/O to the DS is finished. The fix is to remove the BUG_ON() now that the assumptions behind the test are obsolete. Reported-by: Boaz Harrosh <[email protected]> Reported-by: Tigran Mkrtchyan <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> NFS: Alias the nfs module to nfs4 commit 425e776d93a7a5070b77d4f458a5bab0f924652c upstream. This allows distros to remove the line from their modprobe configuration. Signed-off-by: Bryan Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> audit: don't free_chunk() after fsnotify_add_mark() commit 0fe33aae0e94b4097dd433c9399e16e17d638cd8 upstream. Don't do free_chunk() after fsnotify_add_mark(). That one does a delayed unref via the destroy list and this results in use-after-free. Signed-off-by: Miklos Szeredi <[email protected]> Acked-by: Eric Paris <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> audit: fix refcounting in audit-tree commit a2140fc0cb0325bb6384e788edd27b9a568714e2 upstream. Refcounting of fsnotify_mark in audit tree is broken. E.g: refcount create_chunk alloc_chunk 1 fsnotify_add_mark 2 untag_chunk fsnotify_get_mark 3 fsnotify_destroy_mark audit_tree_freeing_mark 2 fsnotify_put_mark 1 fsnotify_put_mark 0 via destroy_list fsnotify_mark_destroy -1 This was reported by various people as triggering Oops when stopping auditd. We could just remove the put_mark from audit_tree_freeing_mark() but that would break freeing via inode destruction. So this patch simply omits a put_mark after calling destroy_mark or adds a get_mark before. The additional get_mark is necessary where there's no other put_mark after fsnotify_destroy_mark() since it assumes that the caller is holding a reference (or the inode is keeping the mark pinned, not the case here AFAICS). Signed-off-by: Miklos Szeredi <[email protected]> Reported-by: Valentin Avram <[email protected]> Reported-by: Peter Moody <[email protected]> Acked-by: Eric Paris <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> svcrpc: fix BUG() in svc_tcp_clear_pages commit be1e44441a560c43c136a562d49a1c9623c91197 upstream. Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if sk_tcplen would lead us to expect n pages of data). svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen. This is OK, since both functions are serialized by XPT_BUSY. However, that means the inconsistency must be repaired before dropping XPT_BUSY. Therefore we should be ensuring that svc_tcp_save_pages repairs the problem before exiting svc_tcp_recv_record on error. Symptoms were a BUG() in svc_tcp_clear_pages. Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping commit d10f27a750312ed5638c876e4bd6aa83664cccd8 upstream. The rpc server tries to ensure that there will be room to send a reply before it receives a request. It does this by tracking, in xpt_reserved, an upper bound on the total size of the replies that is has already committed to for the socket. Currently it is adding in the estimate for a new reply *before* it checks whether there is space available. If it finds that there is not space, it then subtracts the estimate back out. This may lead the subsequent svc_xprt_enqueue to decide that there is space after all. The results is a svc_recv() that will repeatedly return -EAGAIN, causing server threads to loop without doing any actual work. Reported-by: Michael Tokarev <[email protected]> Tested-by: Michael Tokarev <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> svcrpc: sends on closed socket should stop immediately commit f06f00a24d76e168ecb38d352126fd203937b601 upstream. svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply. However, the XPT_CLOSE won't be acted on immediately. Meanwhile other threads could send further replies before the socket is really shut down. This can manifest as data corruption: for example, if a truncated read reply is followed by another rpc reply, that second reply will look to the client like further read data. Symptoms were data corruption preceded by svc_tcp_sendto logging something like kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket Reported-by: Malahal Naineni <[email protected]> Tested-by: Malahal Naineni <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> cciss: fix incorrect scsi status reporting commit b0cf0b118c90477d1a6811f2cd2307f6a5578362 upstream. Delete code which sets SCSI status incorrectly as it's already been set correctly above this incorrect code. The bug was introduced in 2009 by commit b0e15f6 ("cciss: fix typo that causes scsi status to be lost.") Signed-off-by: Stephen M. Cameron <[email protected]> Reported-by: Roel van Meer <[email protected]> Tested-by: Roel van Meer <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ACPI: export symbol acpi_get_table_with_size commit 4f81f986761a7663db7d24d24cd6ae68008f1fc2 upstream. We need it in the radeon drm module to fetch and verify the vbios image on UEFI systems. Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ath9k: fix decrypt_error initialization in ath_rx_tasklet() commit e1352fde5682ab1bdd2a9e5d75c22d1fe210ef77 upstream. ath_rx_tasklet() calls ath9k_rx_skb_preprocess() and ath9k_rx_skb_postprocess() in a loop over the received frames. The decrypt_error flag is initialized to false just outside ath_rx_tasklet() loop. ath9k_rx_accept(), called by ath9k_rx_skb_preprocess(), only sets decrypt_error to true and never to false. Then ath_rx_tasklet() calls ath9k_rx_skb_postprocess() and passes decrypt_error to it. So, after a decryption error, in ath9k_rx_skb_postprocess(), we can have a leftover value from another processed frame. In that case, the frame will not be marked with RX_FLAG_DECRYPTED even if it is decrypted correctly. When using CCMP encryption this issue can lead to connection stuck because of CCMP PN corruption and a waste of CPU time since mac80211 tries to decrypt an already deciphered frame with ieee80211_aes_ccm_decrypt. Fix the issue initializing decrypt_error flag at the begging of the ath_rx_tasklet() loop. Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> PCI: EHCI: Fix crash during hibernation on ASUS computers commit 0b68c8e2c3afaf9807eb1ebe0ccfb3b809570aa4 upstream. Commit dbf0e4c (PCI: EHCI: fix crash during suspend on ASUS computers) added a workaround for an ASUS suspend issue related to USB EHCI and a bug in a number of ASUS BIOSes that attempt to shut down the EHCI controller during system suspend if its PCI command register doesn't contain 0 at that time. It turns out that the same workaround is necessary in the analogous hibernation code path, so add it. References: https://bugzilla.kernel.org/show_bug.cgi?id=45811 Reported-and-tested-by: Oleksij Rempel <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> block: replace __getblk_slow misfix by grow_dev_page fix commit 676ce6d5ca3098339c028d44fe0427d1566a4d2d upstream. Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") is not good: a successful call to grow_buffers() cannot guarantee that the page won't be reclaimed before the immediate next call to __find_get_block(), which is why there was always a loop there. Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595: inode #19278: block 664: comm cc1: unable to read itable block" on console, which pointed to this commit. I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs sometimes fails from a missing header file, under memory pressure on ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7 itself, despite that commit being in there: bisection pointed to an irrelevant pinctrl merge, but hard to tell when failure takes between 18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2). (I've since found such __ext4_get_inode_loc errors in /var/log/messages from previous weeks: why the message never appeared on console until yesterday morning is a mystery for another day.) Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus a checkpatch nitfix). Simplify the interface between grow_buffers() and grow_dev_page(), and avoid the infinite loop beyond end of device by instead checking init_page_buffers()'s end_block there (I presume that's more efficient than a repeated call to blkdev_max_block()), returning -ENXIO to __getblk_slow() in that case. And remove akpm's ten-year-old "__getblk() cannot fail ... weird" comment, but that is worrying: are all users of __getblk() really now prepared for a NULL bh beyond end of device, or will some oops?? Signed-off-by: Hugh Dickins <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: spca506: remove __devinit* from the struct usb_device_id table commit e694d518886c7afedcdd1732477832b2e32744e4 upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Hans de Goede <[email protected]> CC: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: p54usb: remove __devinit* from the struct usb_device_id table commit b9c4167cbbafddac3462134013bc15e63e4c53ef upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Christian Lamparter <[email protected]> CC: "John W. Linville" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: rtl8187: remove __devinit* from the struct usb_device_id table commit a3433179d0822ccfa8e80aa4d1d52843bd2dcc63 upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Herton Ronaldo Krzesinski <[email protected]> CC: Hin-Tak Leung <[email protected]> CC: Larry Finger <[email protected]> CC: "John W. Linville" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: smsusb: remove __devinit* from the struct usb_device_id table commit d04dbd1c0ec17a13326c8f2279399c225836a79f upstream. This structure needs to always stick around, even if CONFIG_HOTPLUG is disabled, otherwise we can oops when trying to probe a device that was added after the structure is thrown away. Thanks to Fengguang Wu and Bjørn Mork for tracking this issue down. Reported-by: Fengguang Wu <[email protected]> Reported-by: Bjørn Mork <[email protected]> CC: Mauro Carvalho Chehab <[email protected]> CC: Michael Krufky <[email protected]> CC: Paul Gortmaker <[email protected]> CC: Doron Cohen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: CDC ACM: Fix NULL pointer dereference commit 99f347caa4568cb803862730b3b1f1942639523f upstream. If a device specifies zero endpoints in its interface descriptor, the kernel oopses in acm_probe(). Even though that's clearly an invalid descriptor, we should test wether we have all endpoints. This is especially bad as this oops can be triggered by just plugging a USB device in. Signed-off-by: Sven Schnelle <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> powerpc: Fix DSCR inheritance in copy_thread() commit 1021cb268b3025573c4811f1dee4a11260c4507b upstream. If the default DSCR is non zero we set thread.dscr_inherit in copy_thread() meaning the new thread and all its children will ignore future updates to the default DSCR. This is not intended and is a change in behaviour that a number of our users have hit. We just need to inherit thread.dscr and thread.dscr_inherit from the parent which ends up being much simpler. This was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_default_test.c Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> powerpc: Restore correct DSCR in context switch commit 714332858bfd40dcf8f741498336d93875c23aa7 upstream. During a context switch we always restore the per thread DSCR value. If we aren't doing explicit DSCR management (ie thread.dscr_inherit == 0) and the default DSCR changed while the process has been sleeping we end up with the wrong value. Check thread.dscr_inherit and select the default DSCR or per thread DSCR as required. This was found with the following test case, when running with more threads than CPUs (ie forcing context switching): http://ozlabs.org/~anton/junkcode/dscr_default_test.c With the four patches applied I can run a combination of all test cases successfully at the same time: http://ozlabs.org/~anton/junkcode/dscr_default_test.c http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Remove user-triggerable BUG from mpol_to_str commit 80de7c3138ee9fd86a98696fd2cf7ad89b995d0a upstream. Trivially triggerable, found by trinity: kernel BUG at mm/mempolicy.c:2546! Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670) Call Trace: show_numa_map+0xd5/0x450 show_pid_numa_map+0x13/0x20 traverse+0xf2/0x230 seq_read+0x34b/0x3e0 vfs_read+0xac/0x180 sys_pread64+0xa2/0xc0 system_call_fastpath+0x1a/0x1f RIP: mpol_to_str+0x156/0x360 Signed-off-by: Dave Jones <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: megaraid_sas: Move poll_aen_lock initializer commit bd8d6dd43a77bfd2b8fef5b094b9d6095e169dee upstream. The following patch moves the poll_aen_lock initializer from megasas_probe_one() to megasas_init(). This prevents a crash when a user loads the driver and tries to issue a poll() system call on the ioctl interface with no adapters present. Signed-off-by: Kashyap Desai <[email protected]> Signed-off-by: Adam Radford <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: mpt2sas: Fix for Driver oops, when loading driver with max_queue_depth command line option to a very small value commit 338b131a3269881c7431234855c93c219b0979b6 upstream. If the specified max_queue_depth setting is less than the expected number of internal commands, then driver will calculate the queue depth size to a negitive number. This negitive number is actually a very large number because variable is unsigned 16bit integer. So, the driver will ask for a very large amount of memory for message frames and resulting into oops as memory allocation routines will not able to handle such a large request. So, in order to limit this kind of oops, The driver need to set the max_queue_depth to a scsi mid layer's can_queue value. Then the overall message frames required for IO is minimum of either (max_queue_depth plus internal commands) or the IOC global credits. Signed-off-by: Sreekanth Reddy <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: Fix 'Device not ready' issue on mpt2sas commit 14216561e164671ce147458653b1fea06a4ada1e upstream. This is a particularly nasty SCSI ATA Translation Layer (SATL) problem. SAT-2 says (section 8.12.2) if the device is in the stopped state as the result of processing a START STOP UNIT command (see 9.11), then the SATL shall terminate the TEST UNIT READY command with CHECK CONDITION status with the sense key set to NOT READY and the additional sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED; mpt2sas internal SATL seems to implement this. The result is very confusing standby behaviour (using hdparm -y). If you suspend a drive and then send another command, usually it wakes up. However, if the next command is a TEST UNIT READY, the SATL sees that the drive is suspended and proceeds to follow the SATL rules for this, returning NOT READY to all subsequent commands. This means that the ordering of TEST UNIT READY is crucial: if you send TUR and then a command, you get a NOT READY to both back. If you send a command and then a TUR, you get GOOD status because the preceeding command woke the drive. This bit us badly because commit 85ef06d1d252f6a2e73b678591ab71caad4667bb Author: Tejun Heo <[email protected]> Date: Fri Jul 1 16:17:47 2011 +0200 block: flush MEDIA_CHANGE from drivers on close(2) Changed our ordering on TEST UNIT READY commands meaning that SATA drives connected to an mpt2sas now suspend and refuse to wake (because the mpt2sas SATL sees the suspend *before* the drives get awoken by the next ATA command) resulting in lots of failed commands. The standard is completely nuts forcing this inconsistent behaviour, but we have to work around it. The fix for this is twofold: 1. Set the allow_restart flag so we wake the drive when we see it has been suspended 2. Return all TEST UNIT READY status directly to the mid layer without any further error handling which prevents us causing error handling which may offline the device just because of a media check TUR. Reported-by: Matthias Prager <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> udf: Fix data corruption for files in ICB commit 9c2fc0de1a6e638fe58c354a463f544f42a90a09 upstream. When a file is stored in ICB (inode), we overwrite part of the file, and the page containing file's data is not in page cache, we end up corrupting file's data by overwriting them with zeros. The problem is we use simple_write_begin() which simply zeroes parts of the page which are not written to. The problem has been introduced by be021ee (udf: convert to new aops). Fix the problem by providing a ->write_begin function which makes the page properly uptodate. Reported-by: Ian Abbott <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ext3: Fix fdatasync() for files with only i_size changes commit 156bddd8e505b295540f3ca0e27dda68cb0d49aa upstream. Code tracking when transaction needs to be committed on fdatasync(2) forgets to handle a situation when only inode's i_size is changed. Thus in such situations fdatasync(2) doesn't force transaction with new i_size to disk and that can result in wrong i_size after a crash. Fix the issue by updating inode's i_datasync_tid whenever its size is updated. Reported-by: Kristian Nielsen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> fuse: fix retrieve length commit c9e67d483776d8d2a5f3f70491161b205930ffe1 upstream. In some cases fuse_retrieve() would return a short byte count if offset was non-zero. The data returned was correct, though. Signed-off-by: Miklos Szeredi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Input: i8042 - add Gigabyte T1005 series netbooks to noloop table commit 7b125b94ca16b7e618c6241cb02c4c8060cea5e3 upstream. They all define their chassis type as "Other" and therefore are not categorized as "laptops" by the driver, which tries to perform AUX IRQ delivery test which fails and causes touchpad not working. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42620 Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/vmwgfx: add MODULE_DEVICE_TABLE so vmwgfx loads at boot commit c4903429a92be60e6fe59868924a65eca4cd1a38 upstream. This will cause udev to load vmwgfx instead of waiting for X to do it. Reviewed-by: Jakob Bornecrantz <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> PARISC: Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts commit bba3d8c3b3c0f2123be5bc687d1cddc13437c923 upstream. The following build error occured during a parisc build with swap-over-NFS patches applied. net/core/sock.c:274:36: error: initializer element is not constant net/core/sock.c:274:36: error: (near initialization for 'memalloc_socks') net/core/sock.c:274:36: error: initializer element is not constant Dave Anglin says: > Here is the line in sock.i: > > struct static_key memalloc_socks = ((struct static_key) { .enabled = > ((atomic_t) { (0) }) }); The above line contains two compound literals. It also uses a designated initializer to initialize the field enabled. A compound literal is not a constant expression. The location of the above statement isn't fully clear, but if a compound literal occurs outside the body of a function, the initializer list must consist of constant expressions. Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> dccp: check ccid before dereferencing commit 276bdb82dedb290511467a5a4fdbe9f0b52dce6f upstream. ccid_hc_rx_getsockopt() and ccid_hc_tx_getsockopt() might be called with a NULL ccid pointer leading to a NULL pointer dereference. This could lead to a privilege escalation if the attacker is able to map page 0 and prepare it with a fake ccid_ops pointer. Signed-off-by: Mathias Krause <[email protected]> Cc: Gerrit Renker <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> hwmon: (asus_atk0110) Add quirk for Asus M5A78L commit 43ca6cb28c871f2fbad10117b0648e5ae3b0f638 upstream. The old interface is bugged and reads the wrong sensor when retrieving the reading for the chassis fan (it reads the CPU sensor); the new interface works fine. Reported-by: Göran Uddeborg <[email protected]> Tested-by: Göran Uddeborg <[email protected]> Signed-off-by: Luca Tettamanti <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Linux 3.0.43 arm: mm: merge derp (again) Conflicts: arch/arm/mm/tlb-v7.S

net: Allow driver to limit number of GSO segments per skb [ Upstream commit 30b678d844af3305cda5953467005cebb5d7b687 ] A peer (or local user) may cause TCP to use a nominal MSS of as little as 88 (actual MSS of 76 with timestamps). Given that we have a sufficiently prodigious local sender and the peer ACKs quickly enough, it is nevertheless possible to grow the window for such a connection to the point that we will try to send just under 64K at once. This results in a single skb that expands to 861 segments. In some drivers with TSO support, such an skb will require hundreds of DMA descriptors; a substantial fraction of a TX ring or even more than a full ring. The TX queue selected for the skb may stall and trigger the TX watchdog repeatedly (since the problem skb will be retried after the TX reset). This particularly affects sfc, for which the issue is designated as CVE-2012-3412. Therefore: 1. Add the field net_device::gso_max_segs holding the device-specific limit. 2. In netif_skb_features(), if the number of segments is too high then mask out GSO features to force fall back to software GSO. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> sfc: Fix maximum number of TSO segments and minimum TX queue size [ Upstream commit 7e6d06f0de3f74ca929441add094518ae332257c ] Currently an skb requiring TSO may not fit within a minimum-size TX queue. The TX queue selected for the skb may stall and trigger the TX watchdog repeatedly (since the problem skb will be retried after the TX reset). This issue is designated as CVE-2012-3412. Set the maximum number of TSO segments for our devices to 100. This should make no difference to behaviour unless the actual MSS is less than about 700. Increase the minimum TX queue size accordingly to allow for 2 worst-case skbs, so that there will definitely be space to add an skb after we wake a queue. To avoid invalidating existing configurations, change efx_ethtool_set_ringparam() to fix up values that are too small rather than returning -EINVAL. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tcp: Apply device TSO segment limit earlier [ Upstream commit 1485348d2424e1131ea42efc033cbd9366462b01 ] Cache the device gso_max_segs in sock::sk_gso_max_segs and use it to limit the size of TSO skbs. This avoids the need to fall back to software GSO for local TCP senders. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net_sched: gact: Fix potential panic in tcf_gact(). [ Upstream commit 696ecdc10622d86541f2e35cc16e15b6b3b1b67e ] gact_rand array is accessed by gact->tcfg_ptype whose value is assumed to less than MAX_RAND, but any range checks are not performed. So add a check in tcf_gact_init(). And in tcf_gact(), we can reduce a branch. Signed-off-by: Hiroaki SHIMODA <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> isdnloop: fix and simplify isdnloop_init() [ Upstream commit 77f00f6324cb97cf1df6f9c4aaeea6ada23abdb2 ] Fix a buffer overflow bug by removing the revision and printk. [ 22.016214] isdnloop-ISDN-driver Rev 1.11.6.7 [ 22.097508] isdnloop: (loop0) virtual card added [ 22.174400] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff83244972 [ 22.174400] [ 22.436157] Pid: 1, comm: swapper Not tainted 3.5.0-bisect-00018-gfa8bbb1-dirty #129 [ 22.624071] Call Trace: [ 22.720558] [<ffffffff832448c3>] ? CallcNew+0x56/0x56 [ 22.815248] [<ffffffff8222b623>] panic+0x110/0x329 [ 22.914330] [<ffffffff83244972>] ? isdnloop_init+0xaf/0xb1 [ 23.014800] [<ffffffff832448c3>] ? CallcNew+0x56/0x56 [ 23.090763] [<ffffffff8108e24b>] __stack_chk_fail+0x2b/0x30 [ 23.185748] [<ffffffff83244972>] isdnloop_init+0xaf/0xb1 Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net/core: Fix potential memory leak in dev_set_alias() [ Upstream commit 7364e445f62825758fa61195d237a5b8ecdd06ec ] Do not leak memory by updating pointer with potentially NULL realloc return value. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> af_packet: remove BUG statement in tpacket_destruct_skb [ Upstream commit 7f5c3e3a80e6654cf48dfba7cf94f88c6b505467 ] Here's a quote of the comment about the BUG macro from asm-generic/bug.h: Don't use BUG() or BUG_ON() unless there's really no way out; one example might be detecting data structure corruption in the middle of an operation that can't be backed out of. If the (sub)system can somehow continue operating, perhaps with reduced functionality, it's probably not BUG-worthy. If you're tempted to BUG(), think again: is completely giving up really the *only* solution? There are usually better options, where users don't need to reboot ASAP and can mostly shut down cleanly. In our case, the status flag of a ring buffer slot is managed from both sides, the kernel space and the user space. This means that even though the kernel side might work as expected, the user space screws up and changes this flag right between the send(2) is triggered when the flag is changed to TP_STATUS_SENDING and a given skb is destructed after some time. Then, this will hit the BUG macro. As David suggested, the best solution is to simply remove this statement since it cannot be used for kernel side internal consistency checks. I've tested it and the system still behaves /stable/ in this case, so in accordance with the above comment, we should rather remove it. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipv6: addrconf: Avoid calling netdevice notifiers with RCU read-side lock [ Upstream commit 4acd4945cd1e1f92b20d14e349c6c6a52acbd42d ] Cong Wang reports that lockdep detected suspicious RCU usage while enabling IPV6 forwarding: [ 1123.310275] =============================== [ 1123.442202] [ INFO: suspicious RCU usage. ] [ 1123.558207] 3.6.0-rc1+ #109 Not tainted [ 1123.665204] ------------------------------- [ 1123.768254] include/linux/rcupdate.h:430 Illegal context switch in RCU read-side critical section! [ 1123.992320] [ 1123.992320] other info that might help us debug this: [ 1123.992320] [ 1124.307382] [ 1124.307382] rcu_scheduler_active = 1, debug_locks = 0 [ 1124.522220] 2 locks held by sysctl/5710: [ 1124.648364] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81768498>] rtnl_trylock+0x15/0x17 [ 1124.882211] #1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81871df8>] rcu_lock_acquire+0x0/0x29 [ 1125.085209] [ 1125.085209] stack backtrace: [ 1125.332213] Pid: 5710, comm: sysctl Not tainted 3.6.0-rc1+ #109 [ 1125.441291] Call Trace: [ 1125.545281] [<ffffffff8109d915>] lockdep_rcu_suspicious+0x109/0x112 [ 1125.667212] [<ffffffff8107c240>] rcu_preempt_sleep_check+0x45/0x47 [ 1125.781838] [<ffffffff8107c260>] __might_sleep+0x1e/0x19b [...] [ 1127.445223] [<ffffffff81757ac5>] call_netdevice_notifiers+0x4a/0x4f [...] [ 1127.772188] [<ffffffff8175e125>] dev_disable_lro+0x32/0x6b [ 1127.885174] [<ffffffff81872d26>] dev_forward_change+0x30/0xcb [ 1128.013214] [<ffffffff818738c4>] addrconf_forward_change+0x85/0xc5 [...] addrconf_forward_change() uses RCU iteration over the netdev list, which is unnecessary since it already holds the RTNL lock. We also cannot reasonably require netdevice notifier functions not to sleep. Reported-by: Cong Wang <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> atm: fix info leak in getsockopt(SO_ATMPVC) [ Upstream commit e862f1a9b7df4e8196ebec45ac62295138aa3fc2 ] The ATM code fails to initialize the two padding bytes of struct sockaddr_atmpvc inserted for alignment. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> atm: fix info leak via getsockname() [ Upstream commit 3c0c5cfdcd4d69ffc4b9c0907cec99039f30a50a ] The ATM code fails to initialize the two padding bytes of struct sockaddr_atmpvc inserted for alignment. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Bluetooth: HCI - Fix info leak in getsockopt(HCI_FILTER) [ Upstream commit e15ca9a0ef9a86f0477530b0f44a725d67f889ee ] The HCI code fails to initialize the two padding bytes of struct hci_ufilter before copying it to userland -- that for leaking two bytes kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Cc: Marcel Holtmann <[email protected]> Cc: Gustavo Padovan <[email protected]> Cc: Johan Hedberg <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Bluetooth: HCI - Fix info leak via getsockname() [ Upstream commit 3f68ba07b1da811bf383b4b701b129bfcb2e4988 ] The HCI code fails to initialize the hci_channel member of struct sockaddr_hci and that for leaks two bytes kernel stack via the getsockname() syscall. Initialize hci_channel with 0 to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Cc: Marcel Holtmann <[email protected]> Cc: Gustavo Padovan <[email protected]> Cc: Johan Hedberg <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Bluetooth: RFCOMM - Fix info leak in ioctl(RFCOMMGETDEVLIST) [ Upstream commit f9432c5ec8b1e9a09b9b0e5569e3c73db8de432a ] The RFCOMM code fails to initialize the two padding bytes of struct rfcomm_dev_list_req inserted for alignment before copying it to userland. Additionally there are two padding bytes in each instance of struct rfcomm_dev_info. The ioctl() that for disclosures two bytes plus dev_num times two bytes uninitialized kernel heap memory. Allocate the memory using kzalloc() to fix this issue. Signed-off-by: Mathias Krause <[email protected]> Cc: Marcel Holtmann <[email protected]> Cc: Gustavo Padovan <[email protected]> Cc: Johan Hedberg <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Bluetooth: RFCOMM - Fix info leak via getsockname() [ Upstream commit 9344a972961d1a6d2c04d9008b13617bcb6ec2ef ] The RFCOMM code fails to initialize the trailing padding byte of struct sockaddr_rc added for alignment. It that for leaks one byte kernel stack via the getsockname() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Cc: Marcel Holtmann <[email protected]> Cc: Gustavo Padovan <[email protected]> Cc: Johan Hedberg <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Bluetooth: L2CAP - Fix info leak via getsockname() [ Upstream commit 792039c73cf176c8e39a6e8beef2c94ff46522ed ] The L2CAP code fails to initialize the l2_bdaddr_type member of struct sockaddr_l2 and the padding byte added for alignment. It that for leaks two bytes kernel stack via the getsockname() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Cc: Marcel Holtmann <[email protected]> Cc: Gustavo Padovan <[email protected]> Cc: Johan Hedberg <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> llc: fix info leak via getsockname() [ Upstream commit 3592aaeb80290bda0f2cf0b5456c97bfc638b192 ] The LLC code wrongly returns 0, i.e. "success", when the socket is zapped. Together with the uninitialized uaddrlen pointer argument from sys_getsockname this leads to an arbitrary memory leak of up to 128 bytes kernel stack via the getsockname() syscall. Return an error instead when the socket is zapped to prevent the info leak. Also remove the unnecessary memset(0). We don't directly write to the memory pointed by uaddr but memcpy() a local structure at the end of the function that is properly initialized. Signed-off-by: Mathias Krause <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> dccp: fix info leak via getsockopt(DCCP_SOCKOPT_CCID_TX_INFO) [ Upstream commit 7b07f8eb75aa3097cdfd4f6eac3da49db787381d ] The CCID3 code fails to initialize the trailing padding bytes of struct tfrc_tx_info added for alignment on 64 bit architectures. It that for potentially leaks four bytes kernel stack via the getsockopt() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Cc: Gerrit Renker <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipvs: fix info leak in getsockopt(IP_VS_SO_GET_TIMEOUT) [ Upstream commit 2d8a041b7bfe1097af21441cb77d6af95f4f4680 ] If at least one of CONFIG_IP_VS_PROTO_TCP or CONFIG_IP_VS_PROTO_UDP is not set, __ip_vs_get_timeouts() does not fully initialize the structure that gets copied to userland and that for leaks up to 12 bytes of kernel stack. Add an explicit memset(0) before passing the structure to __ip_vs_get_timeouts() to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Cc: Wensong Zhang <[email protected]> Cc: Simon Horman <[email protected]> Cc: Julian Anastasov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: fix info leak in compat dev_ifconf() [ Upstream commit 43da5f2e0d0c69ded3d51907d9552310a6b545e8 ] The implementation of dev_ifconf() for the compat ioctl interface uses an intermediate ifc structure allocated in userland for the duration of the syscall. Though, it fails to initialize the padding bytes inserted for alignment and that for leaks four bytes of kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netlink: fix possible spoofing from non-root processes [ Upstream commit 20e1db19db5d6b9e4e83021595eab0dc8f107bef ] Non-root user-space processes can send Netlink messages to other processes that are well-known for being subscribed to Netlink asynchronous notifications. This allows ilegitimate non-root process to send forged messages to Netlink subscribers. The userspace process usually verifies the legitimate origin in two ways: a) Socket credentials. If UID != 0, then the message comes from some ilegitimate process and the message needs to be dropped. b) Netlink portID. In general, portID == 0 means that the origin of the messages comes from the kernel. Thus, discarding any message not coming from the kernel. However, ctnetlink sets the portID in event messages that has been triggered by some user-space process, eg. conntrack utility. So other processes subscribed to ctnetlink events, eg. conntrackd, know that the event was triggered by some user-space action. Neither of the two ways to discard ilegitimate messages coming from non-root processes can help for ctnetlink. This patch adds capability validation in case that dst_pid is set in netlink_sendmsg(). This approach is aggressive since existing applications using any Netlink bus to deliver messages between two user-space processes will break. Note that the exception is NETLINK_USERSOCK, since it is reserved for netlink-to-netlink userspace communication. Still, if anyone wants that his Netlink bus allows netlink-to-netlink userspace, then they can set NL_NONROOT_SEND. However, by default, I don't think it makes sense to allow to use NETLINK_ROUTE to communicate two processes that are sending no matter what information that is not related to link/neighbouring/routing. They should be using NETLINK_USERSOCK instead for that. Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> l2tp: avoid to use synchronize_rcu in tunnel free function [ Upstream commit 99469c32f79a32d8481f87be0d3c66dad286f4ec ] Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be atomic. Signed-off-by: Dmitry Kozlov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: ipv4: ipmr_expire_timer causes crash when removing net namespace [ Upstream commit acbb219d5f53821b2d0080d047800410c0420ea1 ] When tearing down a net namespace, ipv4 mr_table structures are freed without first deactivating their timers. This can result in a crash in run_timer_softirq. This patch mimics the corresponding behaviour in ipv6. Locking and synchronization seem to be adequate. We are about to kfree mrt, so existing code should already make sure that no other references to mrt are pending or can be created by incoming traffic. The functions invoked here do not cause new references to mrt or other race conditions to be created. Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive. Both ipmr_expire_process (whose completion we may have to wait in del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock or other synchronizations when needed, and they both only modify mrt. Tested in Linux 3.4.8. Signed-off-by: Francesco Ruggeri <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> workqueue: reimplement work_on_cpu() using system_wq commit ed48ece27cd3d5ee0354c32bbaec0f3e1d4715c3 upstream. The existing work_on_cpu() implementation is hugely inefficient. It creates a new kthread, execute that single function and then let the kthread die on each invocation. Now that system_wq can handle concurrent executions, there's no advantage of doing this. Reimplement work_on_cpu() using system_wq which makes it simpler and way more efficient. stable: While this isn't a fix in itself, it's needed to fix a workqueue related bug in cpufreq/powernow-k8. AFAICS, this shouldn't break other existing users. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Jiri Kosina <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Len Brown <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU commit 6889125b8b4e09c5e53e6ecab3433bed1ce198c9 upstream. powernowk8_target() runs off a per-cpu work item and if the cpufreq_policy->cpu is different from the current one, it migrates the kworker to the target CPU by manipulating current->cpus_allowed. The function migrates the kworker back to the original CPU but this is still broken. Workqueue concurrency management requires the kworkers to stay on the same CPU and powernowk8_target() ends up triggerring BUG_ON(rq != this_rq()) in try_to_wake_up_local() if it contends on fidvid_mutex and sleeps. It is unclear why this bug is being reported now. Duncan says it appeared to be a regression of 3.6-rc1 and couldn't reproduce it on 3.5. Bisection seemed to point to 63d95a91 "workqueue: use @pool instead of @gcwq or @cpu where applicable" which is an non-functional change. Given that the reproduce case sometimes took upto days to trigger, it's easy to be misled while bisecting. Maybe something made contention on fidvid_mutex more likely? I don't know. This patch fixes the bug by using work_on_cpu() instead if @pol->cpu isn't the same as the current one. The code assumes that cpufreq_policy->cpu is kept online by the caller, which Rafael tells me is the case. stable: ed48ece27c ("workqueue: reimplement work_on_cpu() using system_wq") should be applied before this; otherwise, the behavior could be horrible. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Duncan <[email protected]> Tested-by: Duncan <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Andreas Herrmann <[email protected]> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47301 Signed-off-by: Greg Kroah-Hartman <[email protected]> cciss: fix handling of protocol error commit 2453f5f992717251cfadab6184fbb3ec2f2e8b40 upstream. If a command completes with a status of CMD_PROTOCOL_ERR, this information should be conveyed to the SCSI mid layer, not dropped on the floor. Unlike a similar bug in the hpsa driver, this bug only affects tape drives and CD and DVD ROM drives in the cciss driver, and to induce it, you have to disconnect (or damage) a cable, so it is not a very likely scenario (which would explain why the bug has gone undetected for the last 10 years.) Signed-off-by: Stephen M. Cameron <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> vfs: make O_PATH file descriptors usable for 'fstat()' commit 55815f70147dcfa3ead5738fd56d3574e2e3c1c2 upstream. We already use them for openat() and friends, but fstat() also wants to be able to use O_PATH file descriptors. This should make it more directly comparable to the O_SEARCH of Solaris. Note that you could already do the same thing with "fstatat()" and an empty path, but just doing "fstat()" directly is simpler and faster, so there is no reason not to just allow it directly. See also commit 332a2e1244bd, which did the same thing for fchdir, for the same reasons. Reported-by: ольга крыжановская <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill() commit b161dfa6937ae46d50adce8a7c6b12233e96e7bd upstream. IBM reported a soft lockup after applying the fix for the rename_lock deadlock. Commit c83ce989cb5f ("VFS: Fix the nfs sillyrename regression in kernel 2.6.38") was found to be the culprit. The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the dentry was killed. This flag can be set on non-killed dentries too, which results in infinite retries when trying to traverse the dentry tree. This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is only set in d_kill() and makes try_to_ascend() test only this flag. IBM reported successful test results with this patch. Signed-off-by: Miklos Szeredi <[email protected]> Cc: Trond Myklebust <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netconsole: remove a redundant netconsole_target_put() commit 72d3eb13b5c0abe7d63efac41f39c5b644c7bbaa upstream. This netconsole_target_put() is obviously redundant, and it causes a kernel segfault when removing a bridge device which has netconsole running on it. This is caused by: commit 8d8fc29d02a33e4bd5f4fa47823c1fd386346093 Author: Amerigo Wang <[email protected]> Date: Thu May 19 21:39:10 2011 +0000 netpoll: disable netpoll when enslave a device Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> eCryptfs: Copy up attributes of the lower target inode after rename commit 8335eafc2859e1a26282bef7c3d19f3d68868b8a upstream. After calling into the lower filesystem to do a rename, the lower target inode's attributes were not copied up to the eCryptfs target inode. This resulted in the eCryptfs target inode staying around, rather than being evicted, because i_nlink was not updated for the eCryptfs inode. This also meant that eCryptfs didn't do the final iput() on the lower target inode so it stayed around, as well. This would result in a failure to free up space occupied by the target file in the rename() operation. Both target inodes would eventually be evicted when the eCryptfs filesystem was unmounted. This patch calls fsstack_copy_attr_all() after the lower filesystem does its ->rename() so that important inode attributes, such as i_nlink, are updated at the eCryptfs layer. ecryptfs_evict_inode() is now called and eCryptfs can drop its final reference on the lower inode. http://launchpad.net/bugs/561129 Signed-off-by: Tyler Hicks <[email protected]> Tested-by: Colin Ian King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> target: Fix ->data_length re-assignment bug with SCSI overflow commit 4c054ba63ad47ef244cfcfa1cea38134620a5bae upstream. This patch fixes a long-standing bug with SCSI overflow handling where se_cmd->data_length was incorrectly being re-assigned to the larger CDB extracted allocation length, resulting in a number of fabric level errors that would end up causing a session reset in most cases. So instead now: - Only re-assign se_cmd->data_length durining UNDERFLOW (to use the smaller value) - Use existing se_cmd->data_length for OVERFLOW (to use the smaller value) This fix has been tested with the following CDB to generate an SCSI overflow: sg_raw -r512 /dev/sdc 28 0 0 0 0 0 0 0 9 0 Tested using iscsi-target, tcm_qla2xxx, loopback and tcm_vhost fabric ports. Here is a bit more detail on each case: - iscsi-target: Bug with open-iscsi with overflow, sg_raw returns -3584 bytes of data. - tcm_qla2xxx: Working as expected, returnins 512 bytes of data - loopback: sg_raw returns CHECK_CONDITION, from overflow rejection in transport_generic_map_mem_to_cmd() - tcm_vhost: Same as loopback Reported-by: Roland Dreier <[email protected]> Cc: Roland Dreier <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Boaz Harrosh <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: ice1724: Use linear scale for AK4396 volume control. commit 3737e2be505d872bf2b3c1cd4151b2d2b413d7b5 upstream. The AK4396 DAC has a linear-scale attentuator, but sound/pci/ice1712/prodigy_hifi.c used a log scale instead, which is not quite right. This patch restores the correct scale, borrowing from the ak4396 code in sound/pci/oxygen/oxygen.c. Signed-off-by: Matteo Frigo <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Staging: speakup: fix an improperly-declared variable. commit 4ea418b8b2fa8a70d0fcc8231b65e67b3a72984b upstream. A local static variable was declared as a pointer to a string constant. We're assigning to the underlying memory, so it needs to be an array instead. Signed-off-by: Christopher Brannon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: vt6656: [BUG] - Failed connection, incorrect endian. commit aa209eef3ce8419ff2926c2fa944dfbfb5afbacb upstream. Hi, This patch fixes a bug with driver failing to negotiate a connection. The bug was traced to commit 203e4615ee9d9fa8d3506b9d0ef30095e4d5bc90 staging: vt6656: removed custom definitions of Ethernet packet types In that patch, definitions in include/linux/if_ether.h replaced ones in tether.h which had both big and little endian definitions. include/linux/if_ether.h only refers to big endian values, cpu_to_be16 should be used for the correct endian architectures. Signed-off-by: Malcolm Priestley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: r8712u: fix bug in r8712_recv_indicatepkt() commit abf02cfc179bb4bd30d05f582d61b3b8f429b813 upstream. 64bit arches have a buggy r8712u driver, let's fix it. skb->tail must be set properly or network stack behavior is undefined. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=847525 Addresses https://bugzilla.kernel.org/show_bug.cgi?id=45071 Signed-off-by: Eric Dumazet <[email protected]> Cc: Dave Jones <[email protected]> Acked-by: Larry Finger <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: comedi: das08: Correct AO output for das08jr-16-ao commit 61ed59ed09e6ad2b8395178ea5ad5f653bba08e3 upstream. Don't zero out bits 15..12 of the data value in `das08jr_ao_winsn()` as that knobbles the upper three-quarters of the output range for the 'das08jr-16-ao' board. Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: option: replace ZTE K5006-Z entry with vendor class rule commit ba9edaa468869a8cea242a411066b0f490751798 upstream. Fix the ZTE K5006-Z entry so that it actually matches anything commit f1b5c997 USB: option: add ZTE K5006-Z added a device specific entry assuming that the device would use class/subclass/proto == ff/ff/ff like other ZTE devices. It turns out that ZTE has started using vendor specific subclass and protocol codes: T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=19d2 ProdID=1018 Rev= 0.00 S: Manufacturer=ZTE,Incorporated S: Product=ZTE LTE Technologies MSM S: SerialNumber=MF821Vxxxxxxx C:* #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=86 Prot=10 Driver=(none) E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=02 Prot=05 Driver=(none) E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=02 Prot=01 Driver=(none) E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=06 Prot=00 Driver=qmi_wwan E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms We do not have any information on how ZTE intend to use these codes, but let us assume for now that the 3 sets matching serial functions in the K5006-Z always will identify a serial function in a ZTE device. Cc: Thomas Schäfer <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> perf_event: Switch to internal refcount, fix race with close() commit a6fa941d94b411bbd2b6421ffbde6db3c93e65ab upstream. Don't mess with file refcounts (or keep a reference to file, for that matter) in perf_event. Use explicit refcount of its own instead. Deal with the race between the final reference to event going away and new children getting created for it by use of atomic_long_inc_not_zero() in inherit_event(); just have the latter free what it had allocated and return NULL, that works out just fine (children of siblings of something doomed are created as singletons, same as if the child of leader had been created and immediately killed). Signed-off-by: Al Viro <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mmc: mxs-mmc: fix deadlock in SDIO IRQ case commit 1af36b2a993dddfa3d6860ec4879c9e8abc9b976 upstream. Release the lock before mmc_signal_sdio_irq is called by mxs_mmc_irq_handler. Backtrace: [ 79.660000] ============================================= [ 79.660000] [ INFO: possible recursive locking detected ] [ 79.660000] 3.4.0-00009-g3e96082-dirty #11 Not tainted [ 79.660000] --------------------------------------------- [ 79.660000] swapper/0 is trying to acquire lock: [ 79.660000] (&(&host->lock)->rlock#2){-.....}, at: [<c026ea3c>] mxs_mmc_enable_sdio_irq+0x18/0xd4 [ 79.660000] [ 79.660000] but task is already holding lock: [ 79.660000] (&(&host->lock)->rlock#2){-.....}, at: [<c026f744>] mxs_mmc_irq_handler+0x1c/0xe8 [ 79.660000] [ 79.660000] other info that might help us debug this: [ 79.660000] Possible unsafe locking scenario: [ 79.660000] [ 79.660000] CPU0 [ 79.660000] ---- [ 79.660000] lock(&(&host->lock)->rlock#2); [ 79.660000] lock(&(&host->lock)->rlock#2); [ 79.660000] [ 79.660000] *** DEADLOCK *** [ 79.660000] [ 79.660000] May be due to missing lock nesting notation [ 79.660000] [ 79.660000] 1 lock held by swapper/0: [ 79.660000] #0: (&(&host->lock)->rlock#2){-.....}, at: [<c026f744>] mxs_mmc_irq_handler+0x1c/0xe8 [ 79.660000] [ 79.660000] stack backtrace: [ 79.660000] [<c0014bd0>] (unwind_backtrace+0x0/0xf4) from [<c005f9c0>] (__lock_acquire+0x1948/0x1d48) [ 79.660000] [<c005f9c0>] (__lock_acquire+0x1948/0x1d48) from [<c005fea0>] (lock_acquire+0xe0/0xf8) [ 79.660000] [<c005fea0>] (lock_acquire+0xe0/0xf8) from [<c03a8460>] (_raw_spin_lock_irqsave+0x44/0x58) [ 79.660000] [<c03a8460>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) [ 79.660000] [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) from [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) [ 79.660000] [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) from [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) [ 79.660000] [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) from [<c006bff8>] (handle_irq_event+0x3c/0x5c) [ 79.660000] [<c006bff8>] (handle_irq_event+0x3c/0x5c) from [<c006e6d0>] (handle_level_irq+0x90/0x110) [ 79.660000] [<c006e6d0>] (handle_level_irq+0x90/0x110) from [<c006b930>] (generic_handle_irq+0x38/0x50) [ 79.660000] [<c006b930>] (generic_handle_irq+0x38/0x50) from [<c00102fc>] (handle_IRQ+0x30/0x84) [ 79.660000] [<c00102fc>] (handle_IRQ+0x30/0x84) from [<c000f058>] (__irq_svc+0x38/0x60) [ 79.660000] [<c000f058>] (__irq_svc+0x38/0x60) from [<c0010520>] (default_idle+0x2c/0x40) [ 79.660000] [<c0010520>] (default_idle+0x2c/0x40) from [<c0010a90>] (cpu_idle+0x64/0xcc) [ 79.660000] [<c0010a90>] (cpu_idle+0x64/0xcc) from [<c04ff858>] (start_kernel+0x244/0x2c8) [ 79.660000] BUG: spinlock lockup on CPU#0, swapper/0 [ 79.660000] lock: c398cb2c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0 [ 79.660000] [<c0014bd0>] (unwind_backtrace+0x0/0xf4) from [<c01ddb1c>] (do_raw_spin_lock+0xf0/0x144) [ 79.660000] [<c01ddb1c>] (do_raw_spin_lock+0xf0/0x144) from [<c03a8468>] (_raw_spin_lock_irqsave+0x4c/0x58) [ 79.660000] [<c03a8468>] (_raw_spin_lock_irqsave+0x4c/0x58) from [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) [ 79.660000] [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) from [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) [ 79.660000] [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) from [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) [ 79.660000] [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) from [<c006bff8>] (handle_irq_event+0x3c/0x5c) [ 79.660000] [<c006bff8>] (handle_irq_event+0x3c/0x5c) from [<c006e6d0>] (handle_level_irq+0x90/0x110) [ 79.660000] [<c006e6d0>] (handle_level_irq+0x90/0x110) from [<c006b930>] (generic_handle_irq+0x38/0x50) [ 79.660000] [<c006b930>] (generic_handle_irq+0x38/0x50) from [<c00102fc>] (handle_IRQ+0x30/0x84) [ 79.660000] [<c00102fc>] (handle_IRQ+0x30/0x84) from [<c000f058>] (__irq_svc+0x38/0x60) [ 79.660000] [<c000f058>] (__irq_svc+0x38/0x60) from [<c0010520>] (default_idle+0x2c/0x40) [ 79.660000] [<c0010520>] (default_idle+0x2c/0x40) from [<c0010a90>] (cpu_idle+0x64/0xcc) [ 79.660000] [<c0010a90>] (cpu_idle+0x64/0xcc) from [<c04ff858>] (start_kernel+0x244/0x2c8) Signed-off-by: Lauri Hintsala <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Chris Ball <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mmc: sdhci-esdhc: break out early if clock is 0 commit 74f330bceaa7b88d06062e1cac3d519a3dfc041e upstream. Since commit 30832ab56 ("mmc: sdhci: Always pass clock request value zero to set_clock host op") was merged, esdhc_set_clock starts hitting "if (clock == 0)" where ESDHC_SYSTEM_CONTROL has been operated. This causes SDHCI card-detection function being broken. Fix the regression by moving "if (clock == 0)" above ESDHC_SYSTEM_CONTROL operation. Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Chris Ball <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ahci: Add alternate identifier for the 88SE9172 commit 17c60c6b763cb5b83b0185e7d38d01d18e55a05a upstream. This can also appear as 0x9192. Reported in bugzilla and confirmed with the board documentation for these boards. Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=42970 Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Jeff Garzik <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> kobject: fix oops with "input0: bad kobj_uevent_env content in show_uevent()" commit 60e233a56609fd963c59e99bd75c663d63fa91b6 upstream. Fengguang Wu <[email protected]> writes: > After the __devinit* removal series, I can still get kernel panic in > show_uevent(). So there are more sources of bug.. > > Debug patch: > > @@ -343,8 +343,11 @@ static ssize_t show_uevent(struct device > goto out; > > /* copy keys to file */ > - for (i = 0; i < env->envp_idx; i++) > + dev_err(dev, "uevent %d env[%d]: %s/.../%s\n", env->buflen, env->envp_idx, top_kobj->name, dev->kobj.name); > + for (i = 0; i < env->envp_idx; i++) { > + printk(KERN_ERR "uevent %d env[%d]: %s\n", (int)count, i, env->envp[i]); > count += sprintf(&buf[count], "%s\n", env->envp[i]); > + } > > Oops message, the env[] is again not properly initilized: > > [ 44.068623] input input0: uevent 61 env[805306368]: input0/.../input0 > [ 44.069552] uevent 0 env[0]: (null) This is a completely different CONFIG_HOTPLUG problem, only demonstrating another reason why CONFIG_HOTPLUG should go away. I had a hard time trying to disable it anyway ;-) The problem this time is lots of code assuming that a call to add_uevent_var() will guarantee that env->buflen > 0. This is not true if CONFIG_HOTPLUG is unset. So things like this end up overwriting env->envp_idx because the array index is -1: if (add_uevent_var(env, "MODALIAS=")) return -ENOMEM; len = input_print_modalias(&env->buf[env->buflen - 1], sizeof(env->buf) - env->buflen, dev, 0); Don't know what the best action is, given that there seem to be a *lot* of this around the kernel. This patch "fixes" the problem for me, but I don't know if it can be considered an appropriate fix. [ It is the correct fix for now, for 3.7 forcing CONFIG_HOTPLUG to always be on is the longterm fix, but it's too late for 3.6 and older kernels to resolve this that way - gregkh ] Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Tested-by: Fengguang Wu <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts commit 67a806d9499353fabd5b5ff07337f3aa88a1c3ba upstream. The following build error occurred during an alpha build: net/core/sock.c:274:36: error: initializer element is not constant Dave Anglin says: > Here is the line in sock.i: > > struct static_key memalloc_socks = ((struct static_key) { .enabled = > ((atomic_t) { (0) }) }); The above line contains two compound literals. It also uses a designated initializer to initialize the field enabled. A compound literal is not a constant expression. The location of the above statement isn't fully clear, but if a compound literal occurs outside the body of a function, the initializer list must consist of constant expressions. Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: Michael Cree <[email protected]> Acked-by: Matt Turner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> md: Don't truncate size at 4TB for RAID0 and Linear commit 667a5313ecd7308d79629c0738b0db588b0b0a4e upstream. commit 27a7b260f71439c40546b43588448faac01adb93 md: Fix handling for devices from 2TB to 4TB in 0.90 metadata. changed 0.90 metadata handling to truncated size to 4TB as that is all that 0.90 can record. However for RAID0 and Linear, 0.90 doesn't need to record the size, so this truncation is not needed and causes working arrays to become too small. So avoid the truncation for RAID0 and Linear This bug was introduced in 3.1 and is suitable for any stable kernels from then onwards. As the offending commit was tagged for 'stable', any stable kernel that it was applied to should also get this patch. That includes at least 2.6.32, 2.6.33 and 3.0. (Thanks to Ben Hutchings for providing that list). Signed-off-by: Neil Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm/page_alloc: fix the page address of higher page's buddy calculation commit 0ba8f2d59304dfe69b59c034de723ad80f7ab9ac upstream. The heuristic method for buddy has been introduced since commit 43506fad21ca ("mm/page_alloc.c: simplify calculation of combined index of adjacent buddy lists"). But the page address of higher page's buddy was wrongly calculated, which will lead page_is_buddy to fail for ever. IOW, the heuristic method would be disabled with the wrong page address of higher page's buddy. Calculating the page address of higher page's buddy should be based higher_page with the offset between index of higher page and index of higher page's buddy. Signed-off-by: Haifeng Li <[email protected]> Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Cc: KyongHo Cho <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe commit 8dcebaa9a0ae8a0487f4342f3d56d2cb1c980860 upstream. On some platforms, bootloaders are known to do some interesting RTC programming. Without going into the obscurities as to why this may be the case, suffice it to say the the driver should not make any assumptions about the state of the RTC when the driver loads. In particular, the driver probe should be sure that all interrupts are disabled until otherwise programmed. This was discovered when finding bursty I2C traffic every second on Overo platforms. This I2C overhead was keeping the SoC from hitting deep power states. The cause was found to be the RTC firing every second on the I2C-connected TWL PMIC. Special thanks to Felipe Balbi for suggesting to look for a rogue driver as the source of the I2C traffic rather than the I2C driver itself. Special thanks to Steve Sakoman for helping track down the source of the continuous RTC interrups on the Overo boards. Signed-off-by: Kevin Hilman <[email protected]> Cc: Felipe Balbi <[email protected]> Tested-by: Steve Sakoman <[email protected]> Cc: Alessandro Zummo <[email protected]> Tested-by: Shubhrajyoti Datta <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> hwmon: (twl4030-madc-hwmon) Initialize uninitialized structure elements commit 73d7c119255615a26070f9d6cdb722a166a29015 upstream. twl4030_madc_conversion uses do_avg and type structure elements of twl4030_madc_request. Initialize structure to avoid random operation. Fix for: Coverity CID 200794 Uninitialized scalar variable. Cc: Keerthy <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Acked-by: Jean Delvare <[email protected]> Acked-by: Keerthy <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> can: mcp251x: avoid repeated frame bug commit cab32f39dcc5b35db96497dc0a026b5dea76e4e7 upstream. The MCP2515 has a silicon bug causing repeated frame transmission, see section 5 of MCP2515 Rev. B Silicon Errata Revision G (March 2007). Basically, setting TXBnCTRL.TXREQ in either SPI mode (00 or 11) will eventually cause the bug. The workaround proposed by Microchip is to use mode 00 and send a RTS command on the SPI bus to initiate the transmission. Signed-off-by: Benoît Locher <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm/ia64: fix a memory block size bug commit 05cf96398e1b6502f9e191291b715c7463c9d5dd upstream. I found following definition in include/linux/memory.h, in my IA64 platform, SECTION_SIZE_BITS is equal to 32, and MIN_MEMORY_BLOCK_SIZE will be 0. #define MIN_MEMORY_BLOCK_SIZE (1 << SECTION_SIZE_BITS) Because MIN_MEMORY_BLOCK_SIZE is int type and length of 32bits, so MIN_MEMORY_BLOCK_SIZE(1 << 32) will will equal to 0. Actually when SECTION_SIZE_BITS >= 31, MIN_MEMORY_BLOCK_SIZE will be wrong. This will cause wrong system memory infomation in sysfs. I think it should be: #define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS) And "echo offline > memory0/state" will cause following call trace: kernel BUG at mm/memory_hotplug.c:885! sh[6455]: bugcheck! 0 [1] Pid: 6455, CPU 0, comm: sh psr : 0000101008526030 ifs : 8000000000000fa4 ip : [<a0000001008c40f0>] Not tainted (3.6.0-rc1) ip is at offline_pages+0x210/0xee0 Call Trace: show_stack+0x80/0xa0 show_regs+0x640/0x920 die+0x190/0x2c0 die_if_kernel+0x50/0x80 ia64_bad_break+0x3d0/0x6e0 ia64_native_leave_kernel+0x0/0x270 offline_pages+0x210/0xee0 alloc_pages_current+0x180/0x2a0 Signed-off-by: Jianguo Wu <[email protected]> Signed-off-by: Jiang Liu <[email protected]> Cc: "Luck, Tony" <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> memory hotplug: fix section info double registration bug commit f14851af0ebb32745c6c5a2e400aa0549f9d20df upstream. There may be a bug when registering section info. For example, on my Itanium platform, the pfn range of node0 includes the other nodes, so other nodes' section info will be double registered, and memmap's page count will equal to 3. node0: start_pfn=0x100, spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00 node1: start_pfn=0x80000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x080000-0x100000 node2: start_pfn=0x100000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x100000-0x180000 node3: start_pfn=0x180000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x180000-0x200000 free_all_bootmem_node() register_page_bootmem_info_node() register_page_bootmem_info_section() When hot remove memory, we can't free the memmap's page because page_count() is 2 after put_page_bootmem(). sparse_remove_one_section() free_section_usemap() free_map_bootmem() put_page_bootmem() [[email protected]: add code comment] Signed-off-by: Xishi Qiu <[email protected]> Signed-off-by: Jiang Liu <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xen/boot: Disable NUMA for PV guests. commit 8d54db795dfb1049d45dc34f0dddbc5347ec5642 upstream. The hypervisor is in charge of allocating the proper "NUMA" memory and dealing with the CPU scheduler to keep them bound to the proper NUMA node. The PV guests (and PVHVM) have no inkling of where they run and do not need to know that right now. In the future we will need to inject NUMA configuration data (if a guest spans two or more NUMA nodes) so that the kernel can make the right choices. But those patches are not yet present. In the meantime, disable the NUMA capability in the PV guest, which also fixes a bootup issue. Andre says: "we see Dom0 crashes due to the kernel detecting the NUMA topology not by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA). This will detect the actual NUMA config of the physical machine, but will crash about the mismatch with Dom0's virtual memory. Variation of the theme: Dom0 sees what it's not supposed to see. This happens with the said config option enabled and on a machine where this scanning is still enabled (K8 and Fam10h, not Bulldozer class) We have this dump then: NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10 Scanning NUMA topology in Northbridge 24 Number of physical nodes 4 Node 0 MemBase 0000000000000000 Limit 0000000040000000 Node 1 MemBase 0000000040000000 Limit 0000000138000000 Node 2 MemBase 0000000138000000 Limit 00000001f8000000 Node 3 MemBase 00000001f8000000 Limit 0000000238000000 Initmem setup node 0 0000000000000000-0000000040000000 NODE_DATA [000000003ffd9000 - 000000003fffffff] Initmem setup node 1 0000000040000000-0000000138000000 NODE_DATA [0000000137fd9000 - 0000000137ffffff] Initmem setup node 2 0000000138000000-00000001f8000000 NODE_DATA [00000001f095e000 - 00000001f0984fff] Initmem setup node 3 00000001f8000000-0000000238000000 Cannot find 159744 bytes in node 3 BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96 Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar RIP: e030:[<ffffffff81d220e6>] [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96 .. snip.. [<ffffffff81d23024>] sparse_early_usemaps_alloc_node+0x64/0x178 [<ffffffff81d23348>] sparse_init+0xe4/0x25a [<ffffffff81d16840>] paging_init+0x13/0x22 [<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b [<ffffffff81683954>] ? printk+0x3c/0x3e [<ffffffff81d01a38>] start_kernel+0xe5/0x468 [<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1 [<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36 [<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c " so we just disable NUMA scanning by setting numa_off=1. Reported-and-Tested-by: Andre Przywara <[email protected]> Acked-by: Andre Przywara <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> hwmon: (fam15h_power) Tweak runavg_range on resume commit 5f0ecb907deb1e6f28071ee3bd568903b9da1be4 upstream. The quirk introduced with commit 00250ec90963b7ef6678438888f3244985ecde14 (hwmon: fam15h_power: fix bogus values with current BIOSes) is not only required during driver load but also when system resumes from suspend. The BIOS might set the previously recommended (but unsuitable) initilization value for the running average range register during resume. Signed-off-by: Andreas Herrmann <[email protected]> Tested-by: Andreas Hartmann <[email protected]> Signed-off-by: Jean Delvare <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> hwmon: (ads7871) Add 'name' sysfs attribute commit 4e21f4eaa49f78d3e977e316514c941053871c76 upstream. The 'name' sysfs attribute is mandatory for hwmon devices, but was missing in this driver. Cc: Paul Thomas <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Acked-by: Jean Delvare <[email protected]> Acked-by: Paul Thomas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: mpt2sas: Fix for issue - Unable to boot from the drive connected to HBA commit 10cce6d8b5af0b32bc4254ae4a28423a74c0921c upstream. This patch checks whether HBA is SAS2008 B0 controller. if it is a SAS2008 B0 controller then it use IO-APIC interrupt instead of MSIX, as SAS2008 B0 controller doesn't support MSIX interrupts. [jejb: fix whitespace problems] Signed-off-by: Sreekanth Reddy <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload commit d6532207116307eb7ecbfa7b9e02c53230096a50 upstream. This patch fixes the following kernel panic invoked by uninitialized fields in the chip initialization for the 1G bnx2 iSCSI offload. One of the bits in the chip initialization is being used by the latest firmware to control overflow packets. When this control bit gets enabled erroneously, it would ultimately result in a bad packet placement which would cause the bnx2 driver to dereference a NULL ptr in the placement handler. This can happen under certain stress I/O environment under the Linux iSCSI offload operation. This change only affects Broadcom's 5709 chipset. Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP: [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5 Pid: 0, comm: swapper Tainted: G ---- 2.6.18-333.el5debug #2 RIP: 0010:[<ffffffff881f0e7d>] [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5 RSP: 0018:ffff8101b575bd50 EFLAGS: 00010216 RAX: 0000000000000005 RBX: ffff81007c5fb180 RCX: 0000000000000000 RDX: 0000000000000ffc RSI: 00000000817e8000 RDI: 0000000000000220 RBP: ffff81015bbd7ec0 R08: ffff8100817e9000 R09: 0000000000000000 R10: ffff81007c5fb180 R11: 00000000000000c8 R12: 000000007a25a010 R13: 0000000000000000 R14: 0000000000000005 R15: ffff810159f80558 FS: 0000000000000000(0000) GS:ffff8101afebc240(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006a0 Process swapper (pid: 0, threadinfo ffff8101b5754000, task ffff8101afebd820) Stack: 000000000000000b ffff810159f80000 0000000000000040 ffff810159f80520 ffff810159f80500 00cf00cf8008e84b ffffc200100939e0 ffff810009035b20 0000502900000000 000000be00000001 ffff8100817e7810 00d08101b575bea8 Call Trace: <IRQ> [<ffffffff8008e0d0>] show_schedstat+0x1c2/0x25b [<ffffffff881f1886>] :bnx2:bnx2_poll+0xf6/0x231 [<ffffffff8000c9b9>] net_rx_action+0xac/0x1b1 [<ffffffff800125a0>] __do_softirq+0x89/0x133 [<ffffffff8005e30c>] call_softirq+0x1c/0x28 [<ffffffff8006d5de>] do_softirq+0x2c/0x7d [<ffffffff8006d46e>] do_IRQ+0xee/0xf7 [<ffffffff8005d625>] ret_from_intr+0x0/0xa <EOI> [<ffffffff801a5780>] acpi_processor_idle_simple+0x1c5/0x341 [<ffffffff801a573d>] acpi_processor_idle_simple+0x182/0x341 [<ffffffff801a55bb>] acpi_processor_idle_simple+0x0/0x341 [<ffffffff80049560>] cpu_idle+0x95/0xb8 [<ffffffff80078b1c>] start_secondary+0x479/0x488 Signed-off-by: Eddie Wai <[email protected]> Reviewed-by: Mike Christie <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: hpsa: fix handling of protocol error commit 256d0eaac87da1e993190846064f339f4c7a63f5 upstream. If a command status of CMD_PROTOCOL_ERR is received, this information should be conveyed to the SCSI mid layer, not dropped on the floor. CMD_PROTOCOL_ERR may be received from the Smart Array for any commands destined for an external RAID controller such as a P2000, or commands destined for tape drives or CD/DVD-ROM drives, if for instance a cable is disconnected. This mostly affects multipath configurations, as disconnecting a cable on a non-multipath configuration is not going to do anything good regardless of whether CMD_PROTOCOL_ERR is handled correctly or not. Not handling CMD_PROTOCOL_ERR correctly in a multipath configaration involving external RAID controllers may cause data corruption, so this is quite a serious bug. This bug should not normally cause a problem for direct attached disk storage. Signed-off-by: Stephen M. Cameron <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> hpwdt: Fix kdump issue in hpwdt commit 308b135e4fcc00c80c07e0e04e7afa8edf78583c upstream. kdump can be interrupted by watchdog timer when the timer is left activated on the crash kernel. Changed the hpwdt driver to disable watchdog timer at boot-time. This assures that watchdog timer is disabled until /dev/watchdog is opened, and prevents watchdog timer to be left running on the crash kernel. Signed-off-by: Toshi Kani <[email protected]> Tested-by: Lisa Mitchell <[email protected]> Signed-off-by: Thomas Mingarelli <[email protected]> Signed-off-by: Wim Van Sebroeck <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: fix bad applied patch for arch/arm/Kconfig of stable 3.0.y tree. No upstream commit as this is a merge error in the 3.0 tree. ARM_ERRATA_764369 and PL310_ERRATA_769419 do not appear in config menu in stable 3.0.y tree. This is because backported patch for arm/arm/Kconfig applied wrong place. This patch solves it. Signed-off-by: Tetsuyuki Kobayashi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: arch/arm/Kconfig ARM: 7532/1: decompressor: reset SCTLR.TRE for VMSA ARMv7 cores commit e1e5b7e4251c7538ca08c2c5545b0c2fbd8a6635 upstream. This patch zeroes the SCTLR.TRE bit prior to setting the mapping as cacheable for ARMv7 cores in the decompressor, ensuring that the memory region attributes are obtained from the C and B bits, not from the page tables. Cc: Nicolas Pitre <[email protected]> Reviewed-by: Will Deacon <[email protected]> Signed-off-by: Matthew Leach <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tracing: Don't call page_to_pfn() if page is NULL commit 85f2a2ef1d0ab99523e0b947a2b723f5650ed6aa upstream. When allocating memory fails, page is NULL. page_to_pfn() will cause the kernel panicked if we don't use sparsemem vmemmap. Link: http://lkml.kernel.org/r/[email protected] Acked-by: Mel Gorman <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Andrew Morton <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Signed-off-by: Wen Congyang <[email protected]> Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Input: i8042 - disable mux on Toshiba C850D commit 8669cf6793bb38307a30fb6b9565ddc8840ebd3f upstream. On Toshiba Satellite C850D, the touchpad and the keyboard might randomly not work at boot. Preventing MUX mode activation solves this issue. Signed-off-by: Anisse Astier <[email protected]> Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> asix: Support DLink DUB-E100 H/W Ver C1 commit ed3770a9cd5764a575b83810ea679bbff2b03082 upstream. Signed-off-by: Søren Holm <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> can: ti_hecc: fix oops during rmmod commit ab04c8bd423edb03e2148350a091836c196107fc upstream. This patch fixes an oops which occurs when unloading the driver, while the network interface is still up. The problem is that first the io mapping is teared own, then the CAN device is unregistered, resulting in accessing the hardware's iomem: [ 172.744232] Unable to handle kernel paging request at virtual address c88b0040 [ 172.752441] pgd = c7be4000 [ 172.755645] [c88b0040] *pgd=87821811, *pte=00000000, *ppte=00000000 [ 172.762207] Internal error: Oops: 807 [#1] PREEMPT ARM [ 172.767517] Modules linked in: ti_hecc(-) can_dev [ 172.772430] CPU: 0 Not tainted (3.5.0alpha-00037-g3554cc0 #126) [ 172.778961] PC is at ti_hecc_close+0xb0/0x100 [ti_hecc] [ 172.784423] LR is at __dev_close_many+0x90/0xc0 [ 172.789123] pc : [<bf00c768>] lr : [<c033be58>] psr: 60000013 [ 172.789123] sp : c5c1de68 ip : 00040081 fp : 00000000 [ 172.801025] r10: 00000001 r9 : c5c1c000 r8 : 00100100 [ 172.806457] r7 : c5d0a48c r6 : c5d0a400 r5 : 00000000 r4 : c5d0a000 [ 172.813232] r3 : c88b0000 r2 : 00000001 r1 : c5d0a000 r0 : c5d0a000 [ 172.820037] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 172.827423] Control: 10c5387d Table: 87be4019 DAC: 00000015 [ 172.833404] Process rmmod (pid: 600, stack limit = 0xc5c1c2f0) [ 172.839447] Stack: (0xc5c1de68 to 0xc5c1e000) [ 172.843994] de60: bf00c6b8 c5c1dec8 c5d0a000 c5d0a000 00200200 c033be58 [ 172.852478] de80: c5c1de44 c5c1dec8 c5c1dec8 c033bf2c c5c1de90 c5c1de90 c5d0a084 c5c1de44 [ 172.860992] dea0: c5c1dec8 c033c098 c061d3dc c5d0a000 00000000 c05edf28 c05edb34 c000d724 [ 172.869476] dec0: 00000000 c033c2f8 c5d0a084 c5d0a084 00000000 c033c370 00000000 c5d0a000 [ 172.877990] dee0: c05edb00 c033c3b8 c5d0a000 bf00d3ac c05edb00 bf00d7c8 bf00d7c8 c02842dc [ 172.886474] df00: c02842c8 c0282f90 c5c1c000 c05edb00 bf00d7c8 c0283668 bf00d7c8 00000000 [ 172.894989] df20: c0611f98 befe2f80 c000d724 c0282d10 bf00d804 00000000 00000013 c0068a8c [ 172.903472] df40: c5c538e8 685f6974 00636365 c61571a8 c5cb9980 c61571a8 c6158a20 c00c9bc4 [ 172.911987] df60: 00000000 00000000 c5cb9980 00000000 c5cb9980 00000000 c7823680 00000006 [ 172.920471] df80: bf00d804 00000880 c5c1df8c 00000000 000d4267 befe2f80 00000001 b6d90068 [ 172.928985] dfa0: 00000081 c000d5a0 befe2f80 00000001 befe2f80 00000880 b6d90008 00000008 [ 172.937469] dfc0: befe2f80 00000001 b6d90068 00000081 00000001 00000000 befe2eac 00000000 [ 172.945983] dfe0: 00000000 befe2b18 00023ba4 b6e6addc 60000010 befe2f80 a8e00190 86d2d344 [ 172.954498] [<bf00c768>] (ti_hecc_close+0xb0/0x100 [ti_hecc]) from [<c033be58>] (__dev__registered_many+0xc0/0x2a0) [ 172.984161] [<c033c098>] (rollback_registered_many+0xc0/0x2a0) from [<c033c2f8>] (rollback_registered+0x20/0x30) [ 172.994750] [<c033c2f8>] (rollback_registered+0x20/0x30) from [<c033c370>] (unregister_netdevice_queue+0x68/0x98) [ 173.005401] [<c033c370>] (unregister_netdevice_queue+0x68/0x98) from [<c033c3b8>] (unregister_netdev+0x18/0x20) [ 173.015899] [<c033c3b8>] (unregister_netdev+0x18/0x20) from [<bf00d3ac>] (ti_hecc_remove+0x60/0x80 [ti_hecc]) [ 173.026245] [<bf00d3ac>] (ti_hecc_remove+0x60/0x80 [ti_hecc]) from [<c02842dc>] (platform_drv_remove+0x14/0x18) [ 173.036712] [<c02842dc>] (platform_drv_remove+0x14/0x18) from [<c0282f90>] (__device_release_driver+0x7c/0xbc) Tested-by: Jan Luebbe <[email protected]> Cc: Anant Gole <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> can: janz-ican3: fix support for older hardware revisions commit e21093ef6fb4cbecdf926102286dbe280ae965db upstream. The Revision 1.0 Janz CMOD-IO Carrier Board does not have support for the reset registers. To support older hardware, the code is changed to use the hardware reset register on the Janz VMOD-ICAN3 hardware itself. Signed-off-by: Ira W. Snyder <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> cfg80211: fix possible circular lock on reg_regdb_search() commit a85d0d7f3460b1a123b78e7f7e39bf72c37dfb78 upstream. When call_crda() is called we kick off a witch …

vfs: dcache: fix deadlock in tree traversal commit 8110e16d42d587997bcaee0c864179e6d93603fe upstream. IBM reported a deadlock in select_parent(). This was found to be caused by taking rename_lock when already locked when restarting the tree traversal. There are two cases when the traversal needs to be restarted: 1) concurrent d_move(); this can only happen when not already locked, since taking rename_lock protects against concurrent d_move(). 2) racing with final d_put() on child just at the moment of ascending to parent; rename_lock doesn't protect against this rare race, so it can happen when already locked. Because of case 2, we need to be able to handle restarting the traversal when rename_lock is already held. This patch fixes all three callers of try_to_ascend(). IBM reported that the deadlock is gone with this patch. [ I rewrote the patch to be smaller and just do the "goto again" if the lock was already held, but credit goes to Miklos for the real work. - Linus ] Signed-off-by: Miklos Szeredi <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> dm: handle requests beyond end of device instead of using BUG_ON commit ba1cbad93dd47223b1f3b8edd50dd9ef2abcb2ed upstream. The access beyond the end of device BUG_ON that was introduced to dm_request_fn via commit 29e4013 ("dm: implement REQ_FLUSH/FUA support for request-based dm") was an overly drastic (but simple) response to this situation. I have received a report that this BUG_ON was hit and now think it would be better to use dm_kill_unmapped_request() to fail the clone and original request with -EIO. map_request() will assign the valid target returned by dm_table_find_target to tio->ti. But when the target isn't valid tio->ti is never assigned (because map_request isn't called); so add a check for tio->ti != NULL to dm_done(). Reported-by: Mike Christie <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Jun'ichi Nomura <[email protected]> Signed-off-by: Alasdair G Kergon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: option: blacklist QMI interface on ZTE MF683 commit 160c9425ac52cb30502be2d9c5e848cec91bb115 upstream. Interface Evervolv#5 on ZTE MF683 is a QMI/wwan interface. Signed-off-by: Bjørn Mork <[email protected]> Cc: Shawn J. Goff <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: ftdi_sio: add TIAO USB Multi-Protocol Adapter (TUMPA) support commit 54575b05af36959dfb6a49a3e9ca0c2b456b7126 upstream. TIAO/DIYGADGET USB Multi-Protocol Adapter (TUMPA) is an FTDI FT2232H based device which provides an easily accessible JTAG, SPI, I2C, serial breakout. http://www.diygadget.com/tiao-usb-multi-protocol-adapter-jtag-spi-i2c-serial.html http://www.tiaowiki.com/w/TIAO_USB_Multi_Protocol_Adapter_User%27s_Manual FTDI FT2232H provides two serial channels (A and B), but on the TUMPA channel A is dedicated to JTAG/SPI while channel B can be used for UART/RS-232: use the ftdi_jtag_quirk to expose only channel B as a usb-serial interface to userspace. Signed-off-by: Antonio Ospite <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: qcaux: add Pantech vendor class match commit c638eb2872b3af079501e7ee44cbb8a5cce9b4b5 upstream. The three Pantech devices UML190 (106c:3716), UML290 (106c:3718) and P4200 (106c:3721) all use the same subclasses to identify vendor specific functions. Replace the existing device specific entries with generic vendor matching, adding support for the P4200. Signed-off-by: Bjørn Mork <[email protected]> Cc: Thomas Schäfer <[email protected]> Acked-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: speakup_soft: Fix reading of init string commit 40fe4f89671fb3c7ded94190fb267402a38b0261 upstream. softsynth_read() reads a character at a time from the init string; when it finds the null terminator it sets the initialized flag but then repeats the last character. Additionally, if the read() buffer is not big enough for the init string, the next read() will start reading from the beginning again. So the caller may never progress to reading anything else. Replace the simple initialized flag with the current position in the init string, carried over between calls. Switch to reading real data once this reaches the null terminator. (This assumes that the length of the init string can't change, which seems to be the case. Really, the string and position belong together in a per-file private struct.) Tested-by: Samuel Thibault <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: comedi: s626: don't dereference insn->data commit b655c2c4782ed3e2e71d2608154e295a3e860311 upstream. `s626_enc_insn_config()` is incorrectly dereferencing `insn->data` which is a pointer to user memory. It should be dereferencing the separate `data` parameter that points to a copy of the data in kernel memory. Signed-off-by: Ian Abbott <[email protected]> Reviewed-by: H Hartley Sweeten <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: comedi: jr3_pci: fix iomem dereference commit e1878957b4676a17cf398f7f5723b365e9a2ca48 upstream. Correct a direct dereference of I/O memory to use an appropriate I/O memory access function. Note that the pointer being dereferenced is not currently tagged with `__iomem` but I plan to correct that for 3.7. Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: comedi: don't dereference user memory for INSN_INTTRIG commit 5d06e3df280bd230e2eadc16372e62818c63e894 upstream. `parse_insn()` is dereferencing the user-space pointer `insn->data` directly when handling the `INSN_INTTRIG` comedi instruction. It shouldn't be using `insn->data` at all; it should be using the separate `data` pointer passed to the function. Fix it. Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> staging: comedi: fix memory leak for saved channel list commit c8cad4c89ee3b15935c532210ae6ebb5c0a2734d upstream. When `do_cmd_ioctl()` allocates memory for the kernel copy of a channel list, it frees any previously allocated channel list in `async->cmd.chanlist` and replaces it with the new one. However, if the device is ever removed (or "detached") the cleanup code in `cleanup_device()` in "drivers.c" does not free this memory so it is lost. A sensible place to free the kernel copy of the channel list is in `do_become_nonbusy()` as at that point the comedi asynchronous command associated with the channel list is no longer valid. Free the channel list in `do_become_nonbusy()` instead of `do_cmd_ioctl()` and clear the pointer to prevent it being freed more than once. Note that `cleanup_device()` could be called at an inappropriate time while the comedi device is open, but that's a separate bug not related to this this patch. Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Remove BUG_ON from n_tty_read() commit e9490e93c1978b6669f3e993caa3189be13ce459 upstream. Change the BUG_ON to WARN_ON and return in case of tty->read_buf==NULL. We want to track a couple of long standing reports of this but at the same time we can avoid killing the box. Signed-off-by: Stanislav Kozina <[email protected]> Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> TTY: ttyprintk, don't touch behind tty->write_buf commit ee8b593affdf893012e57f4c54a21984d1b0d92e upstream. If a user provides a buffer larger than a tty->write_buf chunk and passes '\r' at the end of the buffer, we touch an out-of-bound memory. Add a check there to prevent this. Signed-off-by: Jiri Slaby <[email protected]> Cc: Samo Pogacnik <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> serial: pl011: handle corruption at high clock speeds commit c5dd553b9fd069892c9e2de734f4f604e280fa7a upstream. This works around a few glitches in the ST version of the PL011 serial driver when using very high baud rates, as we do in the Ux500: 3, 3.25, 4 and 4.05 Mbps. Problem Observed/rootcause: When using high baud-rates, and the baudrate*8 is getting close to the provided clock frequency (so a division factor close to 1), when using bursts of characters (so they are abutted), then it seems as if there is not enough time to detect the beginning of the start-bit which is a timing reference for the entire character, and thus the sampling moment of character bits is moving towards the end of each bit, instead of the middle. Fix: Increase slightly the RX baud rate of the UART above the theoretical baudrate by 5%. This will definitely give more margin time to the UART_RX to correctly sample the data at the middle of the bit period. Also fix the ages old copy-paste error in the very stressed comment, it's referencing the registers used in the PL010 driver rather than the PL011 ones. Signed-off-by: Guillaume Jaunet <[email protected]> Signed-off-by: Christophe Arnal <[email protected]> Signed-off-by: Matthias Locher <[email protected]> Signed-off-by: Rajanikanth HV <[email protected]> Cc: Bibek Basu <[email protected]> Cc: Par-Gunnar Hjalmdahl <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> serial: set correct baud_base for EXSYS EX-41092 Dual 16950 commit 26e8220adb0aec43b7acafa0f1431760eee28522 upstream. Apparently the same card model has two IDs, so this patch complements the commit 39aced6 adding the missing one. Signed-off-by: Flavio Leitner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> b43legacy: Fix crash on unload when firmware not available commit 2d838bb608e2d1f6cb4280e76748cb812dc822e7 upstream. When b43legacy is loaded without the firmware being available, a following unload generates a kernel NULL pointer dereference BUG as follows: [ 214.330789] BUG: unable to handle kernel NULL pointer dereference at 0000004c [ 214.330997] IP: [<c104c395>] drain_workqueue+0x15/0x170 [ 214.331179] *pde = 00000000 [ 214.331311] Oops: 0000 [Evervolv#1] SMP [ 214.331471] Modules linked in: b43legacy(-) ssb pcmcia mac80211 cfg80211 af_packet mperf arc4 ppdev sr_mod cdrom sg shpchp yenta_socket pcmcia_rsrc pci_hotplug pcmcia_core battery parport_pc parport floppy container ac button edd autofs4 ohci_hcd ehci_hcd usbcore usb_common thermal processor scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh fan thermal_sys hwmon ata_generic pata_ali libata [last unloaded: cfg80211] [ 214.333421] Pid: 3639, comm: modprobe Not tainted 3.6.0-rc6-wl+ #163 Source Technology VIC 9921/ALI Based Notebook [ 214.333580] EIP: 0060:[<c104c395>] EFLAGS: 00010246 CPU: 0 [ 214.333687] EIP is at drain_workqueue+0x15/0x170 [ 214.333788] EAX: c162ac40 EBX: cdfb8360 ECX: 0000002a EDX: 00002a2a [ 214.333890] ESI: 00000000 EDI: 00000000 EBP: cd767e7c ESP: cd767e5c [ 214.333957] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 214.333957] CR0: 8005003b CR2: 0000004c CR3: 0c96a000 CR4: 00000090 [ 214.333957] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 214.333957] DR6: ffff0ff0 DR7: 00000400 [ 214.333957] Process modprobe (pid: 3639, ti=cd766000 task=cf802e90 task.ti=cd766000) [ 214.333957] Stack: [ 214.333957] 00000292 cd767e74 c12c5e09 00000296 00000296 cdfb8360 cdfb9220 00000000 [ 214.333957] cd767e90 c104c4fd cdfb8360 cdfb9220 cd682800 cd767ea4 d0c10184 cd682800 [ 214.333957] cd767ea4 cba31064 cd767eb8 d0867908 cba31064 d087e09c cd96f034 cd767ec4 [ 214.333957] Call Trace: [ 214.333957] [<c12c5e09>] ? skb_dequeue+0x49/0x60 [ 214.333957] [<c104c4fd>] destroy_workqueue+0xd/0x150 [ 214.333957] [<d0c10184>] ieee80211_unregister_hw+0xc4/0x100 [mac80211] [ 214.333957] [<d0867908>] b43legacy_remove+0x78/0x80 [b43legacy] [ 214.333957] [<d083654d>] ssb_device_remove+0x1d/0x30 [ssb] [ 214.333957] [<c126f15a>] __device_release_driver+0x5a/0xb0 [ 214.333957] [<c126fb07>] driver_detach+0x87/0x90 [ 214.333957] [<c126ef4c>] bus_remove_driver+0x6c/0xe0 [ 214.333957] [<c1270120>] driver_unregister+0x40/0x70 [ 214.333957] [<d083686b>] ssb_driver_unregister+0xb/0x10 [ssb] [ 214.333957] [<d087c488>] b43legacy_exit+0xd/0xf [b43legacy] [ 214.333957] [<c1089dde>] sys_delete_module+0x14e/0x2b0 [ 214.333957] [<c110a4a7>] ? vfs_write+0xf7/0x150 [ 214.333957] [<c1240050>] ? tty_write_lock+0x50/0x50 [ 214.333957] [<c110a6f8>] ? sys_write+0x38/0x70 [ 214.333957] [<c1397c55>] syscall_call+0x7/0xb [ 214.333957] Code: bc 27 00 00 00 00 a1 74 61 56 c1 55 89 e5 e8 a3 fc ff ff 5d c3 90 55 89 e5 57 56 89 c6 53 b8 40 ac 62 c1 83 ec 14 e8 bb b7 34 00 <8b> 46 4c 8d 50 01 85 c0 89 56 4c 75 03 83 0e 40 80 05 40 ac 62 [ 214.333957] EIP: [<c104c395>] drain_workqueue+0x15/0x170 SS:ESP 0068:cd767e5c [ 214.333957] CR2: 000000000000004c [ 214.341110] ---[ end trace c7e90ec026d875a6 ]---Index: wireless-testing/drivers/net/wireless/b43legacy/main.c The problem is fixed by making certain that the ucode pointer is not NULL before deregistering the driver in mac80211. Signed-off-by: Larry Finger <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> firmware: Add missing attributes to EFI variable attribute print out from sysfs commit 7083909023bbe29b3176e92d2d089def1aa7aa1e upstream. Some of the EFI variable attributes are missing from print out from /sys/firmware/efi/vars/*/attributes. This patch adds those in. It also updates code to use pre-defined constants for masking current value of attributes. Signed-off-by: Khalid Aziz <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Matthew Garrett <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xhci: Intel Panther Point BEI quirk. commit 80fab3b244a22e0ca539d2439bdda50e81e5666f upstream. When a device with an isochronous endpoint is behind a hub plugged into the Intel Panther Point xHCI host controller, and the driver submits multiple frames per URB, the xHCI driver will set the Block Event Interrupt (BEI) flag on all but the last TD for the URB. This causes the host controller to place an event on the event ring, but not send an interrupt. When the last TD for the URB completes, BEI is cleared, and we get an interrupt for the whole URB. However, under a Panther Point xHCI host controller, if the parent hub is unplugged when one or more events from transfers with BEI set are on the event ring, a port status change event is placed on the event ring, but no interrupt is generated. This means URBs stop completing, and the USB device disconnect is not noticed. Something like a USB headset will cause mplayer to hang when the device is disconnected. If another transfer is sent (such as running `sudo lsusb -v`), the next transfer event seems to "unstick" the event ring, the xHCI driver gets an interrupt, and the disconnect is reported to the USB core. The fix is not to use the BEI flag under the Panther Point xHCI host. This will impact power consumption and system responsiveness, because the xHCI driver will receive an interrupt for every frame in all isochronous URBs instead of once per URB. Intel chipset developers confirm that this bug will be hit if the BEI flag is used on any endpoint, not just ones that are behind a hub. This patch should be backported to kernels as old as 3.0, that contain the commit 69e848c "Intel xhci: Support EHCI/xHCI port switching." Signed-off-by: Sarah Sharp <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> n_gsm: added interlocking for gsm_data_lock for certain code paths commit 5e44708f75b0f8712da715d6babb0c21089b2317 upstream. There were some locking holes in the management of the MUX's message queue for 2 code paths: 1) gsmld_write_wakeup 2) receipt of CMD_FCON flow-control message In both cases gsm_data_kick is called w/o locking so it can collide with other other instances of gsm_data_kick (pulling messages tx_tail) or potentially other instances of __gsm_data_queu (adding messages to tx_head) Changed to take the tx_lock in these 2 cases Signed-off-by: Russ Gorby <[email protected]> Tested-by: Yin, Fengwei <[email protected]> Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> coredump: prevent double-free on an error path in core dumper commit f34f9d186df35e5c39163444c43b4fc6255e39c5 upstream. In !CORE_DUMP_USE_REGSET case, if elf_note_info_init fails to allocate memory for info->fields, it frees already allocated stuff and returns error to its caller, fill_note_info. Which in turn returns error to its caller, elf_core_dump. Which jumps to cleanup label and calls free_note_info, which will happily try to free all info->fields again. BOOM. This is the fix. Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Denys Vlasenko <[email protected]> Cc: Venu Byravarasu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Increase XHCI suspend timeout to 16ms commit a6e097dfdfd189b6929af6efa1d289af61858386 upstream. The Intel XHCI specification says that after clearing the run/stop bit the controller may take up to 16ms to halt. We've seen a device take 14ms, which with the current timeout of 10ms causes the kernel to abort the suspend. Increasing the timeout to the recommended value fixes the problem. This patch should be backported to kernels as old as 2.6.37, that contain the commit 5535b1d "USB: xHCI: PCI power management implementation". Signed-off-by: Michael Spang <[email protected]> Signed-off-by: Sarah Sharp <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> n_gsm: memory leak in uplink error path commit 88ed2a60610974443335c924d7cb8e5dcf9dbdc1 upstream. Uplink (TX) network data will go through gsm_dlci_data_output_framed there is a bug where if memory allocation fails, the skb which has already been pulled off the list will be lost. In addition TX skbs were being processed in LIFO order Fixed the memory leak, and changed to FIFO order processing Signed-off-by: Russ Gorby <[email protected]> Tested-by: Kappel, LaurentX <[email protected]> Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> UBI: fix autoresize handling in R/O mode commit abb3e01103eb4e2ea5c15e6fedbc74e08bd4cc2b upstream. Currently UBI fails in autoresize when it is in R/O mode (e.g., because the underlying MTD device is R/O). This patch fixes the issue - we just skip autoresize and print a warning. Reported-by: Pali Rohár <[email protected]> Signed-off-by: Artem Bityutskiy <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: ibmvscsi: Fix host config length field overflow commit 225c56960fcafeccc2b6304f96cd3f0dbf42a16a upstream. The length field in the host config packet is only 16-bit long, so passing it 0x10000 (64K which is our standard PAGE_SIZE) doesn't work and result in an empty config from the server. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Acked-by: Robert Jennings <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: hpsa: Use LUN reset instead of target reset commit 21e89afd325849eb38adccf382df16cc895911f9 upstream. It turns out Smart Array logical drives do not support target reset and when the target reset fails, the logical drive will be taken off line. Symptoms look like this: hpsa 0000:03:00.0: Abort request on C1:B0:T0:L0 hpsa 0000:03:00.0: resetting device 1:0:0:0 hpsa 0000:03:00.0: cp ffff880037c56000 is reported invalid (probably means target device no longer present) hpsa 0000:03:00.0: resetting device failed. sd 1:0:0:0: Device offlined - not ready after error recovery sd 1:0:0:0: rejecting I/O to offline device EXT3-fs error (device sdb1): read_block_bitmap: LUN reset is supported though, and is what we should be using. Target reset is also disruptive in shared SAS situations, for example, an external MSA1210m which does support target reset attached to Smart Arrays in multiple hosts -- a target reset from one host is disruptive to other hosts as all LUNs on the target will be reset and will abort all outstanding i/os back to all the attached hosts. So we should use LUN reset, not target reset. Tested this with Smart Array logical drives and with tape drives. Not sure how this bug survived since 2009, except it must be very rare for a Smart Array to require more than 30s to complete a request. Signed-off-by: Stephen M. Cameron <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> can: mscan-mpc5xxx: fix return value check in mpc512x_can_get_clock() commit f61bd0585dfc7d99db4936d7467de4ca8e2f7ea0 upstream. In case of error, the function clk_get() returns ERR_PTR() and never returns NULL pointer. The NULL test in the error handling should be replaced with IS_ERR(). dpatch engine is used to auto generated this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <[email protected]> Acked-by: Wolfgang Grandegger <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> IPoIB: Fix use-after-free of multicast object commit bea1e22df494a729978e7f2c54f7bda328f74bc3 upstream. Fix a crash in ipoib_mcast_join_task(). (with help from Or Gerlitz) Commit c8c2afe ("IPoIB: Use rtnl lock/unlock when changing device flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which is run from the ipoib_workqueue, and hence the workqueue can't be flushed from the context of ipoib_stop(). In the current code, ipoib_stop() (which doesn't flush the workqueue) calls ipoib_mcast_dev_flush(), which goes and deletes all the multicast entries. This takes place without any synchronization with a possible running instance of ipoib_mcast_join_task() for the same ipoib device, leading to a crash due to NULL pointer dereference. Fix this by making sure that the workqueue is flushed before ipoib_mcast_dev_flush() is called. To make that possible, we move the RTNL-lock wrapped code to ipoib_mcast_join_finish(). Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Roland Dreier <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> IB/srp: Fix use-after-free in srp_reset_req() commit 9b796d06d5d1b1e85ae2316a283ea11dd739ef96 upstream. srp_free_req() uses the scsi_cmnd structure contents to unmap buffers, so we must invoke srp_free_req() before we release ownership of that structure. Signed-off-by: Bart Van Assche <[email protected]> Acked-by: David Dillow <[email protected]> Signed-off-by: Roland Dreier <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> IB/srp: Avoid having aborted requests hang commit d8536670916a685df116b5c2cb256573fd25e4e3 upstream. We need to call scsi_done() for commands after we abort them. Signed-off-by: Bart Van Assche <[email protected]> Acked-by: David Dillow <[email protected]> Signed-off-by: Roland Dreier <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> isci: fix isci_pci_probe() generates warning on efi failure path commit 6d70a74ffd616073a68ae0974d98819bfa8e6da6 upstream. The oem parameter image embedded in the efi variable is at an offset from the start of the variable. However, in the failure path we try to free the 'orom' pointer which is only valid when the paramaters are being read from the legacy option-rom space. Since failure to load the oem parameters is unlikely and we keep the memory around in the success case just defer all de-allocation to devm. Reported-by: Don Morris <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> x86/alternatives: Fix p6 nops on non-modular kernels commit cb09cad44f07044d9810f18f6f9a6a6f3771f979 upstream. Probably a leftover from the early days of self-patching, p6nops are marked __initconst_or_module, which causes them to be discarded in a non-modular kernel. If something later triggers patching, it will overwrite kernel code with garbage. Reported-by: Tomas Racek <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Cc: Michael Tokarev <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: [email protected] Cc: Anthony Liguori <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Alan Cox <[email protected]> Cc: Alan Cox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Cc: Ben Jencks <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> PCI: honor child buses add_size in hot plug configuration commit be768912a49b10b68e96fbd8fa3cab0adfbd3091 upstream. git commit c8adf9a "PCI: pre-allocate additional resources to devices only after successful allocation of essential resources." fails to take into consideration the optional-resources needed by children devices while calculating the optional-resource needed by the bridge. This can be a problem on some setup. For example, if a hotplug bridge has 8 children hotplug bridges, the bridge should have enough resources to accomodate the hotplug requirements for each of its children hotplug bridges. Currently this is not the case. This patch fixes the problem. Signed-off-by: Yinghai Lu <[email protected]> Reviewed-by: Ram Pai <[email protected]> Signed-off-by: Jesse Barnes <[email protected]> Cc: Andrew Worsley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: scsi_remove_target: fix softlockup regression on hot remove commit bc3f02a795d3b4faa99d37390174be2a75d091bd upstream. John reports: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202] [..] Call Trace: [<ffffffff8141782a>] scsi_remove_target+0xda/0x1f0 [<ffffffff81421de5>] sas_rphy_remove+0x55/0x60 [<ffffffff81421e01>] sas_rphy_delete+0x11/0x20 [<ffffffff81421e35>] sas_port_delete+0x25/0x160 [<ffffffff814549a3>] mptsas_del_end_device+0x183/0x270 ...introduced by commit 3b661a9 "[SCSI] fix hot unplug vs async scan race". Don't restart lookup of more stargets in the multi-target case, just arrange to traverse the list once, on the assumption that new targets are always added at the end. There is no guarantee that the target will change state in scsi_target_reap() so we can end up spinning if we restart. Acked-by: Jack Wang <[email protected]> LKML-Reference: <CAEhu1-6wq1YsNiscGMwP4ud0Q+MrViRzv=kcWCQSBNc8c68N5Q@mail.gmail.com> Reported-by: John Drescher <[email protected]> Tested-by: John Drescher <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: scsi_dh_alua: Enable STPG for unavailable ports commit e47f8976d8e573928824a06748f7bc82c58d747f upstream. A quote from SPC-4: "While in the unavailable primary target port asymmetric access state, the device server shall support those of the following commands that it supports while in the active/optimized state: [ ... ] d) SET TARGET PORT GROUPS; [ ... ]". Hence enable sending STPG to a target port group that is in the unavailable state. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Mike Christie <[email protected]> Acked-by: Hannes Reinecke <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Linux 3.0.45

mn10300: only add -mmem-funcs to KBUILD_CFLAGS if gcc supports it commit 9957423f035c2071f6d1c5d2f095cdafbeb25ad7 upstream. It seems the current (gcc 4.6.3) no longer provides this so make it conditional. As reported by Tony before, the mn10300 architecture cross-compiles with gcc-4.6.3 if -mmem-funcs is not added to KBUILD_CFLAGS. Reported-by: Tony Breeds <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: David Howells <[email protected]> Cc: Koichi Yasutake <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> kbuild: make: fix if_changed when command contains backslashes commit c353acba28fb3fa1fd05fd6b85a9fc7938330f9c upstream. The call if_changed mechanism does not work when the command contains backslashes. This basically is an issue with lzo and bzip2 compressed kernels. The compressed binaries do not contain the uncompressed image size, so these use size_append to append the size. This results in backslashes in the executed command. With this if_changed always detects a change in the command and rebuilds the compressed image even if nothing has changed. Fix this by escaping backslashes in make-cmd Signed-off-by: Sascha Hauer <[email protected]> Signed-off-by: Jan Luebbe <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Bernhard Walle <[email protected]> Cc: Michal Marek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> media: rc: ite-cir: Initialise ite_dev::rdev earlier commit 4b961180ef275035b1538317ffd0e21e80e63e77 upstream. ite_dev::rdev is currently initialised in ite_probe() after rc_register_device() returns. If a newly registered device is opened quickly enough, we may enable interrupts and try to use ite_dev::rdev before it has been initialised. Move it up to the earliest point we can, right after calling rc_allocate_device(). Reported-and-tested-by: YunQiang Su <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ACPI: run _OSC after ACPI_FULL_INITIALIZATION commit fc54ab72959edbf229b65ac74b2f122d799ca002 upstream. The _OSC method may exist in module level code, so it must be called after ACPI_FULL_INITIALIZATION On some new platforms with Zero-Power-Optical-Disk-Drive (ZPODD) support, this fix is necessary to save power. Signed-off-by: Lin Ming <[email protected]> Tested-by: Aaron Lu <[email protected]> Signed-off-by: Len Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> PCI: acpiphp: check whether _ADR evaluation succeeded commit dfb117b3e50c52c7b3416db4a4569224b8db80bb upstream. Check whether we evaluated _ADR successfully. Previously we ignored failure, so we would have used garbage data from the stack as the device and function number. We return AE_OK so that we ignore only this slot and continue looking for other slots. Found by Coverity (CID 113981). Signed-off-by: Bjorn Helgaas <[email protected]> [bwh: Backported to 2.6.32/3.0: adjust context] Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> lib/gcd.c: prevent possible div by 0 commit e96875677fb2b7cb739c5d7769824dff7260d31d upstream. Account for all properties when a and/or b are 0: gcd(0, 0) = 0 gcd(a, 0) = a gcd(0, b) = b Fixes no known problems in current kernels. Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> kernel/sys.c: call disable_nonboot_cpus() in kernel_restart() commit f96972f2dc6365421cf2366ebd61ee4cf060c8d5 upstream. As kernel_power_off() calls disable_nonboot_cpus(), we may also want to have kernel_restart() call disable_nonboot_cpus(). Doing so can help machines that require boot cpu be the last alive cpu during reboot to survive with kernel restart. This fixes one reboot issue seen on imx6q (Cortex-A9 Quad). The machine requires that the restart routine be run on the primary cpu rather than secondary ones. Otherwise, the secondary core running the restart routine will fail to come to online after reboot. Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drivers/scsi/atp870u.c: fix bad use of udelay commit 0f6d93aa9d96cc9022b51bd10d462b03296be146 upstream. The ACARD driver calls udelay() with a value > 2000, which leads to to the following compilation error on ARM: ERROR: "__bad_udelay" [drivers/scsi/atp870u.ko] undefined! make[1]: *** [__modpost] Error 1 This is because udelay is defined on ARM, roughly speaking, as #define udelay(n) ((n) > 2000 ? __bad_udelay() : \ __const_udelay((n) * ((2199023U*HZ)>>11))) The argument to __const_udelay is the number of jiffies to wait divided by 4, but this does not work unless the multiplication does not overflow, and that is what the build error is designed to prevent. The intended behavior can be achieved by using mdelay to call udelay multiple times in a loop. [[email protected]: adding context] Signed-off-by: Martin Michlmayr <[email protected]> Signed-off-by: Jonathan Nieder <[email protected]> Cc: James Bottomley <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> workqueue: add missing smp_wmb() in process_one_work() commit 959d1af8cffc8fd38ed53e8be1cf4ab8782f9c00 upstream. WORK_STRUCT_PENDING is used to claim ownership of a work item and process_one_work() releases it before starting execution. When someone else grabs PENDING, all pre-release updates to the work item should be visible and all updates made by the new owner should happen afterwards. Grabbing PENDING uses test_and_set_bit() and thus has a full barrier; however, clearing doesn't have a matching wmb. Given the preceding spin_unlock and use of clear_bit, I don't believe this can be a problem on an actual machine and there hasn't been any related report but it still is theretically possible for clear_pending to permeate upwards and happen before work->entry update. Add an explicit smp_wmb() before work_clear_pending(). Signed-off-by: Tejun Heo <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm: Workaround incompatibility of ESN and async crypto [ Upstream commit 3b59df46a449ec9975146d71318c4777ad086744 ] ESN for esp is defined in RFC 4303. This RFC assumes that the sequence number counters are always up to date. However, this is not true if an async crypto algorithm is employed. If the sequence number counters are not up to date on sequence number check, we may incorrectly update the upper 32 bit of the sequence number. This leads to a DOS. We workaround this by comparing the upper sequence number, (used for authentication) with the upper sequence number computed after the async processing. We drop the packet if these numbers are different. To do this, we introduce a recheck function that does this check in the ESN case. Signed-off-by: Steffen Klassert <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: return error pointer instead of NULL [ Upstream commit 864745d291b5ba80ea0bd0edcbe67273de368836 ] When dump_one_state() returns an error, e.g. because of a too small buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL instead of an error pointer. But its callers expect an error pointer and therefore continue to operate on a NULL skbuff. This could lead to a privilege escalation (execution of user code in kernel context) if the attacker has CAP_NET_ADMIN and is able to map address 0. Signed-off-by: Mathias Krause <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: return error pointer instead of NULL #2 [ Upstream commit c25463722509fef0ed630b271576a8c9a70236f3 ] When dump_one_policy() returns an error, e.g. because of a too small buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns NULL instead of an error pointer. But its caller expects an error pointer and therefore continues to operate on a NULL skbuff. Signed-off-by: Mathias Krause <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm: fix a read lock imbalance in make_blackhole [ Upstream commit 433a19548061bb5457b6ab77ed7ea58ca6e43ddb ] if xfrm_policy_get_afinfo returns 0, it has already released the read lock, xfrm_policy_put_afinfo should not be called again. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: fix info leak in copy_to_user_auth() [ Upstream commit 4c87308bdea31a7b4828a51f6156e6f721a1fcc9 ] copy_to_user_auth() fails to initialize the remainder of alg_name and therefore discloses up to 54 bytes of heap memory via netlink to userland. Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name with null bytes. Signed-off-by: Mathias Krause <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: fix info leak in copy_to_user_state() [ Upstream commit f778a636713a435d3a922c60b1622a91136560c1 ] The memory reserved to dump the xfrm state includes the padding bytes of struct xfrm_usersa_info added by the compiler for alignment (7 for amd64, 3 for i386). Add an explicit memset(0) before filling the buffer to avoid the info leak. Signed-off-by: Mathias Krause <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: fix info leak in copy_to_user_policy() [ Upstream commit 7b789836f434c87168eab067cfbed1ec4783dffd ] The memory reserved to dump the xfrm policy includes multiple padding bytes added by the compiler for alignment (padding bytes in struct xfrm_selector and struct xfrm_userpolicy_info). Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Signed-off-by: Mathias Krause <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: fix info leak in copy_to_user_tmpl() [ Upstream commit 1f86840f897717f86d523a13e99a447e6a5d2fa5 ] The memory used for the template copy is a local stack variable. As struct xfrm_user_tmpl contains multiple holes added by the compiler for alignment, not initializing the memory will lead to leaking stack bytes to userland. Add an explicit memset(0) to avoid the info leak. Initial version of the patch by Brad Spengler. Signed-off-by: Mathias Krause <[email protected]> Cc: Brad Spengler <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: don't copy esn replay window twice for new states [ Upstream commit e3ac104d41a97b42316915020ba228c505447d21 ] The ESN replay window was already fully initialized in xfrm_alloc_replay_state_esn(). No need to copy it again. Signed-off-by: Mathias Krause <[email protected]> Cc: Steffen Klassert <[email protected]> Acked-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xfrm_user: ensure user supplied esn replay window is valid [ Upstream commit ecd7918745234e423dd87fcc0c077da557909720 ] The current code fails to ensure that the netlink message actually contains as many bytes as the header indicates. If a user creates a new state or updates an existing one but does not supply the bytes for the whole ESN replay window, the kernel copies random heap bytes into the replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL netlink attribute. This leads to following issues: 1. The replay window has random bits set confusing the replay handling code later on. 2. A malicious user could use this flaw to leak up to ~3.5kB of heap memory when she has access to the XFRM netlink interface (requires CAP_NET_ADMIN). Known users of the ESN replay window are strongSwan and Steffen's iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter uses the interface with a bitmap supplied while the former does not. strongSwan is therefore prone to run into issue 1. To fix both issues without breaking existing userland allow using the XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a fully specified one. For the former case we initialize the in-kernel bitmap with zero, for the latter we copy the user supplied bitmap. For state updates the full bitmap must be supplied. To prevent overflows in the bitmap length calculation the maximum size of bmp_len is limited to 128 by this patch -- resulting in a maximum replay window of 4096 packets. This should be sufficient for all real life scenarios (RFC 4303 recommends a default replay window size of 64). Signed-off-by: Mathias Krause <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: Martin Willi <[email protected]> Cc: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: ethernet: davinci_cpdma: decrease the desc count when cleaning up the remaining packets [ Upstream commit ffb5ba90017505a19e238e986e6d33f09e4df765 ] chan->count is used by rx channel. If the desc count is not updated by the clean up loop in cpdma_chan_stop, the value written to the rxfree register in cpdma_chan_start will be incorrect. Signed-off-by: Tao Hou <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ixp4xx_hss: fix build failure due to missing linux/module.h inclusion [ Upstream commit 0b836ddde177bdd5790ade83772860940bd481ea ] Commit 36a1211970193ce215de50ed1e4e1272bc814df1 (netprio_cgroup.h: dont include module.h from other includes) made the following build error on ixp4xx_hss pop up: CC [M] drivers/net/wan/ixp4xx_hss.o drivers/net/wan/ixp4xx_hss.c:1412:20: error: expected ';', ',' or ')' before string constant drivers/net/wan/ixp4xx_hss.c:1413:25: error: expected ';', ',' or ')' before string constant drivers/net/wan/ixp4xx_hss.c:1414:21: error: expected ';', ',' or ')' before string constant drivers/net/wan/ixp4xx_hss.c:1415:19: error: expected ';', ',' or ')' before string constant make[8]: *** [drivers/net/wan/ixp4xx_hss.o] Error 1 This was previously hidden because ixp4xx_hss includes linux/hdlc.h which includes linux/netdevice.h which includes linux/netprio_cgroup.h which used to include linux/module.h. The real issue was actually present since the initial commit that added this driver since it uses macros from linux/module.h without including this file. Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netxen: check for root bus in netxen_mask_aer_correctable [ Upstream commit e4d1aa40e363ed3e0486aeeeb0d173f7f822737e ] Add a check if pdev->bus->self == NULL (root bus). When attaching a netxen NIC to a VM it can be on the root bus and the guest would crash in netxen_mask_aer_correctable() because of a NULL pointer dereference if CONFIG_PCIEAER is present. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net-sched: sch_cbq: avoid infinite loop [ Upstream commit bdfc87f7d1e253e0a61e2fc6a75ea9d76f7fc03a ] Its possible to setup a bad cbq configuration leading to an infinite loop in cbq_classify() DEV_OUT=eth0 ICMP="match ip protocol 1 0xff" U32="protocol ip u32" DST="match ip dst" tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \ bandwidth 100mbit tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \ rate 512kbit allot 1500 prio 5 bounded isolated tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \ $ICMP $DST 192.168.3.234 flowid 1: Reported-by: Denys Fedoryschenko <[email protected]> Tested-by: Denys Fedoryschenko <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> pkt_sched: fix virtual-start-time update in QFQ [ Upstream commit 71261956973ba9e0637848a5adb4a5819b4bae83 ] If the old timestamps of a class, say cl, are stale when the class becomes active, then QFQ may assign to cl a much higher start time than the maximum value allowed. This may happen when QFQ assigns to the start time of cl the finish time of a group whose classes are characterized by a higher value of the ratio max_class_pkt/weight_of_the_class with respect to that of cl. Inserting a class with a too high start time into the bucket list corrupts the data structure and may eventually lead to crashes. This patch limits the maximum start time assigned to a class. Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> sierra_net: Endianess bug fix. [ Upstream commit 2120c52da6fe741454a60644018ad2a6abd957ac ] I discovered I couldn't get sierra_net to work on a powerpc. Turns out the firmware attribute check assumes the system is little endian and hence fails because the attributes is a 16 bit value. Signed-off-by: Len Sorensen <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> 8021q: fix mac_len recomputation in vlan_untag() [ Upstream commit 5316cf9a5197eb80b2800e1acadde287924ca975 ] skb_reset_mac_len() relies on the value of the skb->network_header pointer, therefore we must wait for such pointer to be recalculated before computing the new mac_len value. Signed-off-by: Antonio Quartulli <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipv6: release reference of ip6_null_entry's dst entry in __ip6_del_rt [ Upstream commit 6825a26c2dc21eb4f8df9c06d3786ddec97cf53b ] as we hold dst_entry before we call __ip6_del_rt, so we should alse call dst_release not only return -ENOENT when the rt6_info is ip6_null_entry. and we already hold the dst entry, so I think it's safe to call dst_release out of the write-read lock. Signed-off-by: Gao feng <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero [ Upstream commit 15c041759bfcd9ab0a4e43f1c16e2644977d0467 ] If recv() syscall is called for a TCP socket so that - IOAT DMA is used - MSG_WAITALL flag is used - requested length is bigger than sk_rcvbuf - enough data has already arrived to bring rcv_wnd to zero then when tcp_recvmsg() gets to calling sk_wait_data(), receive window can be still zero while sk_async_wait_queue exhausts enough space to keep it zero. As this queue isn't cleaned until the tcp_service_net_dma() call, sk_wait_data() cannot receive any data and blocks forever. If zero receive window and non-empty sk_async_wait_queue is detected before calling sk_wait_data(), process the queue first. Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> sctp: Don't charge for data in sndbuf again when transmitting packet [ Upstream commit 4c3a5bdae293f75cdf729c6c00124e8489af2276 ] SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up by __sctp_write_space(). Buffer space charged via sctp_set_owner_w() is released in sctp_wfree() which calls __sctp_write_space() directly. Buffer space charged via skb_set_owner_w() is released via sock_wfree() which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set. sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets. Therefore if sctp_packet_transmit() manages to queue up more than sndbuf bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is interrupted by a signal. This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ... Charging for the data twice does not make sense in the first place, it leads to overcharging sndbuf by a factor 2. Therefore this patch only charges a single byte in wmem_alloc when transmitting an SCTP packet to ensure that the socket stays alive until the packet has been released. This means that control chunks are no longer accounted for in wmem_alloc which I believe is not a problem as skb->truesize will typically lead to overcharging anyway and thus compensates for any control overhead. Signed-off-by: Thomas Graf <[email protected]> CC: Vlad Yasevich <[email protected]> CC: Neil Horman <[email protected]> CC: David Miller <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> pppoe: drop PPPOX_ZOMBIEs in pppoe_release [ Upstream commit 2b018d57ff18e5405823e5cb59651a5b4d946d7b ] When PPPOE is running over a virtual ethernet interface (e.g., a bonding interface) and the user tries to delete the interface in case the PPPOE state is ZOMBIE, the kernel will loop forever while unregistering net_device for the reference count is not decreased to zero which should have been done with dev_put(). Signed-off-by: Xiaodong Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: small bug on rxhash calculation [ Upstream commit 6862234238e84648c305526af2edd98badcad1e0 ] In the current rxhash calculation function, while the sorting of the ports/addrs is coherent (you get the same rxhash for packets sharing the same 4-tuple, in both directions), ports and addrs are sorted independently. This implies packets from a connection between the same addresses but crossed ports hash to the same rxhash. For example, traffic between A=S:l and B=L:s is hashed (in both directions) from {L, S, {s, l}}. The same rxhash is obtained for packets between C=S:s and D=L:l. This patch ensures that you either swap both addrs and ports, or you swap none. Traffic between A and B, and traffic between C and D, get their rxhash from different sources ({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D) The patch is co-written with Eric Dumazet <[email protected]> Signed-off-by: Chema Gonzalez <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: guard tcp_set_keepalive() to tcp sockets [ Upstream commit 3e10986d1d698140747fcfc2761ec9cb64c1d582 ] Its possible to use RAW sockets to get a crash in tcp_set_keepalive() / sk_reset_timer() Fix is to make sure socket is a SOCK_STREAM one. Reported-by: Dave Jones <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipv4: raw: fix icmp_filter() [ Upstream commit ab43ed8b7490cb387782423ecf74aeee7237e591 ] icmp_filter() should not modify its input, or else its caller would need to recompute ip_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipv6: raw: fix icmpv6_filter() [ Upstream commit 1b05c4b50edbddbdde715c4a7350629819f6655e ] icmpv6_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Also, if icmpv6 header cannot be found, do not deliver the packet, as we do in IPv4. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipv6: mip6: fix mip6_mh_filter() [ Upstream commit 96af69ea2a83d292238bdba20e4508ee967cf8cb ] mip6_mh_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> l2tp: fix a typo in l2tp_eth_dev_recv() [ Upstream commit c0cc88a7627c333de50b07b7c60b1d49d9d2e6cc ] While investigating l2tp bug, I hit a bug in eth_type_trans(), because not enough bytes were pulled in skb head. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netrom: copy_datagram_iovec can fail [ Upstream commit 6cf5c951175abcec4da470c50565cc0afe6cd11d ] Check for an error from this and if so bail properly. Signed-off-by: Alan Cox <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: do not disable sg for packets requiring no checksum [ Upstream commit c0d680e577ff171e7b37dbdb1b1bf5451e851f04 ] A change in a series of VLAN-related changes appears to have inadvertently disabled the use of the scatter gather feature of network cards for transmission of non-IP ethernet protocols like ATA over Ethernet (AoE). Below is a reference to the commit that introduces a "harmonize_features" function that turns off scatter gather when the NIC does not support hardware checksumming for the ethernet protocol of an sk buff. commit f01a5236bd4b140198fbcc550f085e8361fd73fa Author: Jesse Gross <[email protected]> Date: Sun Jan 9 06:23:31 2011 +0000 net offloading: Generalize netif_get_vlan_features(). The can_checksum_protocol function is not equipped to consider a protocol that does not require checksumming. Calling it for a protocol that requires no checksum is inappropriate. The patch below has harmonize_features call can_checksum_protocol when the protocol needs a checksum, so that the network layer is not forced to perform unnecessary skb linearization on the transmission of AoE packets. Unnecessary linearization results in decreased performance and increased memory pressure, as reported here: http://www.spinics.net/lists/linux-mm/msg15184.html The problem has probably not been widely experienced yet, because only recently has the kernel.org-distributed aoe driver acquired the ability to use payloads of over a page in size, with the patchset recently included in the mm tree: https://lkml.org/lkml/2012/8/28/140 The coraid.com-distributed aoe driver already could use payloads of greater than a page in size, but its users generally do not use the newest kernels. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> aoe: assert AoE packets marked as requiring no checksum [ Upstream commit 8babe8cc6570ed896b7b596337eb8fe730c3ff45 ] In order for the network layer to see that AoE requires no checksumming in a generic way, the packets must be marked as requiring no checksum, so we make this requirement explicit with the assertion. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tg3: Fix TSO CAP for 5704 devs w / ASF enabled [ Upstream commit cf9ecf4b631f649a964fa611f1a5e8874f2a76db ] On the earliest TSO capable devices, TSO was accomplished through firmware. The TSO cannot coexist with ASF management firmware though. The tg3 driver determines whether or not ASF is enabled by calling tg3_get_eeprom_hw_cfg(), which checks a particular bit of NIC memory. Commit dabc5c670d3f86d15ee4f42ab38ec5bd2682487d, entitled "tg3: Move TSO_CAPABLE assignment", accidentally moved the code that determines TSO capabilities earlier than the call to tg3_get_eeprom_hw_cfg(). As a consequence, the driver was attempting to determine TSO capabilities before it had all the data it needed to make the decision. This patch fixes the problem by revisiting and reevaluating the decision after tg3_get_eeprom_hw_cfg() is called. Signed-off-by: Matt Carlson <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: zfcp: Make trace record tags unique commit 0100998dbfe6dfcd90a6e912ca7ed6f255d48f25 upstream. Duplicate fssrh_2 from a54ca0f62f953898b05549391ac2a8a4dad6482b "[SCSI] zfcp: Redesign of the debug tracing for HBA records." complicates distinction of generic status read response from local link up. Duplicate fsscth1 from 2c55b750a884b86dea8b4cc5f15e1484cc47a25c "[SCSI] zfcp: Redesign of the debug tracing for SAN records." complicates distinction of good common transport response from invalid port handle. Signed-off-by: Steffen Maier <[email protected]> Reviewed-by: Martin Peschke <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: zfcp: Do not wakeup while suspended commit cb45214960bc989af8b911ebd77da541c797717d upstream. If the mapping of FCP device bus ID and corresponding subchannel is modified while the Linux image is suspended, the resume of FCP devices can fail. During resume, zfcp gets callbacks from cio regarding the modified subchannels but they can be arbitrarily mixed with the restore/resume callback. Since the cio callbacks would trigger adapter recovery, zfcp could wakeup before the resume callback. Therefore, ignore the cio callbacks regarding subchannels while being suspended. We can safely do so, since zfcp does not deal itself with subchannels. For problem determination purposes, we still trace the ignored callback events. The following kernel messages could be seen on resume: kernel: <WWPN>: parent <FCP device bus ID> should not be sleeping As part of adapter reopen recovery, zfcp performs auto port scanning which can erroneously try to register new remote ports with scsi_transport_fc and the device core code complains about the parent (adapter) still sleeping. kernel: zfcp.3dff9c: <FCP device bus ID>:\ Setting up the QDIO connection to the FCP adapter failed <last kernel message repeated 3 more times> kernel: zfcp.574d43: <FCP device bus ID>:\ ERP cannot recover an error on the FCP device In such cases, the adapter gave up recovery and remained blocked along with its child objects: remote ports and LUNs/scsi devices. Even the adapter shutdown as part of giving up recovery failed because the ccw device state remained disconnected. Later, the corresponding remote ports ran into dev_loss_tmo. As a result, the LUNs were erroneously not available again after resume. Even a manually triggered adapter recovery (e.g. sysfs attribute failed, or device offline/online via sysfs) could not recover the adapter due to the remaining disconnected state of the corresponding ccw device. Signed-off-by: Steffen Maier <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: zfcp: remove invalid reference to list iterator variable commit ca579c9f136af4274ccfd1bcaee7f38a29a0e2e9 upstream. If list_for_each_entry, etc complete a traversal of the list, the iterator variable ends up pointing to an address at an offset from the list head, and not a meaningful structure. Thus this value should not be used after the end of the iterator. Replace port->adapter->scsi_host by adapter->scsi_host. This problem was found using Coccinelle (http://coccinelle.lip6.fr/). Oversight in upsteam commit of v2.6.37 a1ca48319a9aa1c5b57ce142f538e76050bb8972 "[SCSI] zfcp: Move ACL/CFDC code to zfcp_cfdc.c" which merged the content of zfcp_erp_port_access_changed(). Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Steffen Maier <[email protected]> Reviewed-by: Martin Peschke <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: zfcp: restore refcount check on port_remove commit d99b601b63386f3395dc26a699ae703a273d9982 upstream. Upstream commit f3450c7b917201bb49d67032e9f60d5125675d6a "[SCSI] zfcp: Replace local reference counting with common kref" accidentally dropped a reference count check before tearing down zfcp_ports that are potentially in use by zfcp_units. Even remote ports in use can be removed causing unreachable garbage objects zfcp_ports with zfcp_units. Thus units won't come back even after a manual port_rescan. The kref of zfcp_port->dev.kobj is already used by the driver core. We cannot re-use it to track the number of zfcp_units. Re-introduce our own counter for units per port and check on port_remove. Signed-off-by: Steffen Maier <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SCSI: zfcp: only access zfcp_scsi_dev for valid scsi_device commit d436de8ce25f53a8a880a931886821f632247943 upstream. __scsi_remove_device (e.g. due to dev_loss_tmo) calls zfcp_scsi_slave_destroy which in turn sends a close LUN FSF request to the adapter. After 30 seconds without response, zfcp_erp_timeout_handler kicks the ERP thread failing the close LUN ERP action. zfcp_erp_wait in zfcp_erp_lun_shutdown_wait and thus zfcp_scsi_slave_destroy returns and then scsi_device is no longer valid. Sometime later the response to the close LUN FSF request may finally come in. However, commit b62a8d9b45b971a67a0f8413338c230e3117dff5 "[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit" introduced a number of attempts to unconditionally access struct zfcp_scsi_dev through struct scsi_device causing a use-after-free. This leads to an Oops due to kernel page fault in one of: zfcp_fsf_abort_fcp_command_handler, zfcp_fsf_open_lun_handler, zfcp_fsf_close_lun_handler, zfcp_fsf_req_trace, zfcp_fsf_fcp_handler_common. Move dereferencing of zfcp private data zfcp_scsi_dev allocated in scsi_device via scsi_transport_reserve_device after the check for potentially aborted FSF request and thus no longer valid scsi_device. Only then assign sdev_to_zfcp(sdev) to the local auto variable struct zfcp_scsi_dev *zfcp_sdev. Signed-off-by: Martin Peschke <[email protected]> Signed-off-by: Steffen Maier <[email protected]> Signed-off-by: James Bottomley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> PCI: Check P2P bridge for invalid secondary/subordinate range commit 1965f66e7db08d1ebccd24a59043eba826cc1ce8 upstream. For bridges with "secondary > subordinate", i.e., invalid bus number apertures, we don't enumerate anything behind the bridge unless the user specified "pci=assign-busses". This patch makes us automatically try to reassign the downstream bus numbers in this case (just for that bridge, not for all bridges as "pci=assign-busses" does). We don't discover all the devices on the Intel DP43BF motherboard without this change (or "pci=assign-busses") because its BIOS configures a bridge as: pci 0000:00:1e.0: PCI bridge to [bus 20-08] (subtractive decode) [bhelgaas: changelog, change message to dev_info] Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=18412 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=625754 Reported-by: Brian C. Huffman <[email protected]> Reported-by: VL <[email protected]> Tested-by: VL <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> ext4: online defrag is not supported for journaled files commit f066055a3449f0e5b0ae4f3ceab4445bead47638 upstream. Proper block swap for inodes with full journaling enabled is truly non obvious task. In order to be on a safe side let's explicitly disable it for now. Signed-off-by: Dmitry Monakhov <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ext4: always set i_op in ext4_mknod() commit 6a08f447facb4f9e29fcc30fb68060bb5a0d21c2 upstream. ext4_special_inode_operations have their own ifdef CONFIG_EXT4_FS_XATTR to mask those methods. And ext4_iget also always sets it, so there is an inconsistency. Signed-off-by: Bernd Schubert <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ext4: fix fdatasync() for files with only i_size changes commit b71fc079b5d8f42b2a52743c8d2f1d35d655b1c5 upstream. Code tracking when transaction needs to be committed on fdatasync(2) forgets to handle a situation when only inode's i_size is changed. Thus in such situations fdatasync(2) doesn't force transaction with new i_size to disk and that can result in wrong i_size after a crash. Fix the issue by updating inode's i_datasync_tid whenever its size is updated. Reported-by: Kristian Nielsen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ASoC: wm9712: Fix name of Capture Switch commit 689185b78ba6fbe0042f662a468b5565909dff7a upstream. Help UIs associate it with the matching gain control. Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm: fix invalidate_complete_page2() lock ordering commit ec4d9f626d5908b6052c2973f37992f1db52e967 upstream. In fuzzing with trinity, lockdep protested "possible irq lock inversion dependency detected" when isolate_lru_page() reenabled interrupts while still holding the supposedly irq-safe tree_lock: invalidate_inode_pages2 invalidate_complete_page2 spin_lock_irq(&mapping->tree_lock) clear_page_mlock isolate_lru_page spin_unlock_irq(&zone->lru_lock) isolate_lru_page() is correct to enable interrupts unconditionally: invalidate_complete_page2() is incorrect to call clear_page_mlock() while holding tree_lock, which is supposed to nest inside lru_lock. Both truncate_complete_page() and invalidate_complete_page() call clear_page_mlock() before taking tree_lock to remove page from radix_tree. I guess invalidate_complete_page2() preferred to test PageDirty (again) under tree_lock before committing to the munlock; but since the page has already been unmapped, its state is already somewhat inconsistent, and no worse if clear_page_mlock() moved up. Reported-by: Sasha Levin <[email protected]> Deciphered-by: Andrew Morton <[email protected]> Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Ying Han <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mm: thp: fix pmd_present for split_huge_page and PROT_NONE with THP commit 027ef6c87853b0a9df53175063028edb4950d476 upstream. In many places !pmd_present has been converted to pmd_none. For pmds that's equivalent and pmd_none is quicker so using pmd_none is better. However (unless we delete pmd_present) we should provide an accurate pmd_present too. This will avoid the risk of code thinking the pmd is non present because it's under __split_huge_page_map, see the pmd_mknotpresent there and the comment above it. If the page has been mprotected as PROT_NONE, it would also lead to a pmd_present false negative in the same way as the race with split_huge_page. Because the PSE bit stays on at all times (both during split_huge_page and when the _PAGE_PROTNONE bit get set), we could only check for the PSE bit, but checking the PROTNONE bit too is still good to remember pmd_present must always keep PROT_NONE into account. This explains a not reproducible BUG_ON that was seldom reported on the lists. The same issue is in pmd_large, it would go wrong with both PROT_NONE and if it races with split_huge_page. Signed-off-by: Andrea Arcangeli <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: aloop - add locking to timer access commit d4f1e48bd11e3df6a26811f7a1f06c4225d92f7d upstream. When the loopback timer handler is running, calling del_timer() (for STOP trigger) will not wait for the handler to complete before deactivating the timer. The timer gets rescheduled in the handler as usual. Then a subsequent START trigger will try to start the timer using add_timer() with a timer pending leading to a kernel panic. Serialize the calls to add_timer() and del_timer() using a spin lock to avoid this. Signed-off-by: Omair Mohammed Abdullah <[email protected]> Signed-off-by: Vinod Koul <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: usb - disable broken hw volume for Tenx TP6911 commit c10514394ef9e8de93a4ad8c8904d71dcd82c122 upstream. While going through Ubuntu bugs, I discovered this patch being posted and a confirmation that the patch works as expected. Finding out how the hw volume really works would be preferrable to just disabling the broken one, but this would be better than nothing. Credit: sndfnsdfin (qawsnews) BugLink: https://bugs.launchpad.net/bugs/559939 Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: USB: Support for (original) Xbox Communicator commit c05fce586d4da2dfe0309bef3795a8586e967bc3 upstream. Added support for Xbox Communicator to USB quirks. Signed-off-by: Marko Friedemann <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: only adjust default clocks on NI GPUs commit 2e3b3b105ab3bb5b6a37198da4f193cd13781d13 upstream. SI asics store voltage information differently so we don't have a way to deal with it properly yet. Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: Add MSI quirk for gateway RS690 commit 3a6d59df80897cc87812b6826d70085905bed013 upstream. Fixes another system on: https://bugs.freedesktop.org/show_bug.cgi?id=37679 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: force MSIs on RS690 asics commit fb6ca6d154cdcd53e7f27f8dbba513830372699b upstream. There are so many quirks, lets just try and force this for all RS690s. See: https://bugs.freedesktop.org/show_bug.cgi?id=37679 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> rcu: Fix day-one dyntick-idle stall-warning bug commit a10d206ef1a83121ab7430cb196e0376a7145b22 upstream. Each grace period is supposed to have at least one callback waiting for that grace period to complete. However, if CONFIG_NO_HZ=n, an extra callback-free grace period is no big problem -- it will chew up a tiny bit of CPU time, but it will complete normally. In contrast, CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to sleep indefinitely, in turn indefinitely delaying completion of the callback-free grace period. Given that nothing is waiting on this grace period, this is also not a problem. That is, unless RCU CPU stall warnings are also enabled, as they are in recent kernels. In this case, if a CPU wakes up after at least one minute of inactivity, an RCU CPU stall warning will result. The reason that no one noticed until quite recently is that most systems have enough OS noise that they will never remain absolutely idle for a full minute. But there are some embedded systems with cut-down userspace configurations that consistently get into this situation. All this begs the question of exactly how a callback-free grace period gets started in the first place. This can happen due to the fact that CPUs do not necessarily agree on which grace period is in progress. If a CPU still believes that the grace period that just completed is still ongoing, it will believe that it has callbacks that need to wait for another grace period, never mind the fact that the grace period that they were waiting for just completed. This CPU can therefore erroneously decide to start a new grace period. Note that this can happen in TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock considerations mean that the CPU that detected the end of the grace period is not necessarily officially informed of this fact for some time. Once this CPU notices that the earlier grace period completed, it will invoke its callbacks. It then won't have any callbacks left. If no other CPU has any callbacks, we now have a callback-free grace period. This commit therefore makes CPUs check more carefully before starting a new grace period. This new check relies on an array of tail pointers into each CPU's list of callbacks. If the CPU is up to date on which grace periods have completed, it checks to see if any callbacks follow the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks follow the RCU_WAIT_TAIL segment. The reason that this works is that the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment as soon as the CPU is officially notified that the old grace period has ended. This change is to cpu_needs_another_gp(), which is called in a number of places. The only one that really matters is in rcu_start_gp(), where the root rcu_node structure's ->lock is held, which prevents any other CPU from starting or completing a grace period, so that the comparison that determines whether the CPU is missing the completion of a grace period is stable. Reported-by: Becky Bruce <[email protected]> Reported-by: Subodh Nijsure <[email protected]> Reported-by: Paul Walmsley <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Tested-by: Paul Walmsley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: fix wake on lan setting for non-8111E. commit d4ed95d796e5126bba51466dc07e287cebc8bd19 upstream. Only 8111E needs enable RxConfig bit 0 ~ 3 when suspending or shutdowning for wake on lan. Signed-off-by: Hayes Wang <[email protected]> Acked-by: Francois Romieu <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: don't enable rx when shutdown. commit aaa89c08d9ffa3739c93d65d98b73ec2aa2e93a5 upstream. Only 8111b needs to enable rx when shutdowning with WoL. Signed-off-by: Hayes Wang <[email protected]> Acked-by: Francois Romieu <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: remove erroneous processing of always set bit. commit e03f33af79f0772156e1a1a1e36bdddf8012b2e4 upstream. When set, RxFOVF (resp. RxBOVF) is always 1 (resp. 0). Signed-off-by: Francois Romieu <[email protected]> Cc: Hayes <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: jumbo fixes. commit d58d46b5d85139d18eb939aa7279c160bab70484 upstream. - fix features : jumbo frames and checksumming can not be used at the same time. - introduce hw_jumbo_{enable / disable} helpers. Their content has been creatively extracted from Realtek's own drivers. As an illustration, it would be nice to know how/if the MaxTxPacketSize register operates when the device can work with a 9k jumbo frame as its documentation (8168c) can not be applied beyond ~7k. - rtl_tx_performance_tweak is moved forward. No change. Signed-off-by: Francois Romieu <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: expand received packet length indication. commit deb9d93c89d311714a60809b28160e538e1cbb43 upstream. 8168d and above allow jumbo frames beyond 8k. Bump the received packet length check before enabling jumbo frames on these chipsets. Frame length indication covers bits 0..13 of the first Rx descriptor 32 bits for the 8169 and 8168. I only have authoritative documentation for the allowed use of the extra (13) bit with the 8169 and 8168c. Realtek's drivers use the same mask for the 816x and the fast ethernet only 810x. Signed-off-by: Francois Romieu <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: increase the delay parameter of pm_schedule_suspend commit 10953db8e1a278742ef7e64a3d1491802bcfa98b upstream The link down would occur when reseting PHY. And it would take about 2 ~ 5 seconds from link down to link up. If the delay of pm_schedule_suspend is not long enough, the device would enter runtime_suspend before link up. After link up, the device would wake up and reset PHY again. Then, you would find the driver keep in a loop of runtime_suspend and rumtime_resume. Signed-off-by: Hayes Wang <[email protected]> Acked-by: Francois Romieu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: Rx FIFO overflow fixes. commit 811fd3010cf512f2e23e6c4c912aad54516dc706 upstream. Realtek has specified that the post 8168c gigabit chips and the post 8105e fast ethernet chips recover automatically from a Rx FIFO overflow. The driver does not need to clear the RxFIFOOver bit of IntrStatus and it should rather avoid messing it. The implementation deserves some explanation: 1. events outside of the intr_event bit mask are now ignored. It enforces a no-processing policy for the events that either should not be there or should be ignored. 2. RxFIFOOver was already ignored in rtl_cfg_infos[RTL_CFG_1] for the whole 8168 line of chips with two exceptions: - RTL_GIGA_MAC_VER_22 since b5ba6d12bdac21bc0620a5089e0f24e362645efd ("use RxFIFO overflow workaround for 8168c chipset."). This one should now be correctly handled. - RTL_GIGA_MAC_VER_11 (8168b) which requires a different Rx FIFO overflow processing. Though it does not conform to Realtek suggestion above, the updated driver includes no change for RTL_GIGA_MAC_VER_12 and RTL_GIGA_MAC_VER_17. Both are 8168b. RTL_GIGA_MAC_VER_12 is common and a bit old so I'd rather wait for experimental evidence that the change suggested by Realtek really helps or does not hurt in unexpected ways. Removed case statements in rtl8169_interrupt are only 8168 relevant. 3. RxFIFOOver is masked for post 8105e 810x chips, namely the sole 8105e (RTL_GIGA_MAC_VER_30) itself. Signed-off-by: Francois Romieu <[email protected]> Cc: hayeswang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Reviewed-by: Jonathan Nieder <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: fix Config2 MSIEnable bit setting. commit 2ca6cf06d988fea21e812a86be79353352677c9c upstream. The MSIEnable bit is only available for the 8169. Avoid Config2 writes for the post-8169 8168 and 810x. Reported-by: Su Kang Yin <[email protected]> Signed-off-by: Francois Romieu <[email protected]> Cc: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: missing barriers. commit 1e874e041fc7c222cbd85b20c4406070be1f687a upstream. Signed-off-by: Francois Romieu <[email protected]> Cc: Hayes Wang <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: runtime resume before shutdown. commit 2a15cd2ff488a9fdb55e5e34060f499853b27c77 upstream. With runtime PM, if the ethernet cable is disconnected, the device is transitioned to D3 state to conserve energy. If the system is shutdown in this state, any register accesses in rtl_shutdown are dropped on the floor. As the device was programmed by .runtime_suspend() to wake on link changes, it is thus brought back up as soon as the link recovers. Resuming every suspended device through the driver core would slow things down and it is not clear how many devices really need it now. Original report and D0 transition patch by Sameer Nanda. Patch has been changed to comply with advices by Rafael J. Wysocki and the PM folks. Reported-by: Sameer Nanda <[email protected]> Signed-off-by: Francois Romieu <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Hayes Wang <[email protected]> Cc: Alan Stern <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: Config1 is read-only on 8168c and later. commit 851e60221926a53344b4227879858bef841b0477 upstream. Suggested by Hayes. Signed-off-by: Francois Romieu <[email protected]> Cc: Hayes Wang <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: 8168c and later require bit 0x20 to be set in Config2 for PME signaling. commit d387b427c973974dd619a33549c070ac5d0e089f upstream. The new 84xx stopped flying below the radars. Signed-off-by: Francois Romieu <[email protected]> Cc: Hayes Wang <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: fix unsigned int wraparound with TSO commit 477206a018f902895bfcd069dd820bfe94c187b1 upstream. The r8169 may get stuck or show bad behaviour after activating TSO : the net_device is not stopped when it has no more TX descriptors. This problem comes from TX_BUFS_AVAIL which may reach -1 when all transmit descriptors are in use. The patch simply tries to keep positive values. Tested with 8111d(onboard) on a D510MO, and with 8111e(onboard) on a Zotac 890GXITX. Signed-off-by: Julien Ducourthial <[email protected]> Acked-by: Francois Romieu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> r8169: call netif_napi_del at errpaths and at driver unload commit ad1be8d345416a794dea39761a374032aa471a76 upstream. When register_netdev fails, the init'ed NAPIs by netif_napi_add must be deleted with netif_napi_del, and also when driver unloads, it should delete the NAPI before unregistering netdevice using unregister_netdev. Signed-off-by: Devendra Naga <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" commit 8d34694c1abf29df1f3c7317936b7e3e2e308d9b upstream. Commit 05f144a0d5c2 ("mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages") removed vma->vm_policy updates code but it is the purpose of mbind_range(). Now, mbind_range() is virtually a no-op and while it does not allow memory corruption it is not the right fix. This patch is a revert. [[email protected]: Edited changelog] Signed-off-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Mel Gorman <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Josh Boyer <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mempolicy: remove mempolicy sharing commit 869833f2c5c6e4dd09a5378cfc665ffb4615e5d2 upstream. Dave Jones' system call fuzz testing tool "trinity" triggered the following bug error with slab debugging enabled ============================================================================= BUG numa_policy (Not tainted): Poison overwritten ----------------------------------------------------------------------------- INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154 __slab_alloc+0x3d3/0x445 kmem_cache_alloc+0x29d/0x2b0 mpol_new+0xa3/0x140 sys_mbind+0x142/0x620 system_call_fastpath+0x16/0x1b INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154 __slab_free+0x2e/0x1de kmem_cache_free+0x25a/0x260 __mpol_put+0x27/0x30 remove_vma+0x68/0x90 exit_mmap+0x118/0x140 mmput+0x73/0x110 exit_mm+0x108/0x130 do_exit+0x162/0xb90 do_group_exit+0x4f/0xc0 sys_exit_group+0x17/0x20 system_call_fastpath+0x16/0x1b INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x (null) flags=0x20000000004080 INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0 The problem is that the structure is being prematurely freed due to a reference count imbalance. In the following case mbind(addr, len) should replace the memory policies of both vma1 and vma2 and thus they will become to share the same mempolicy and the new mempolicy will have the MPOL_F_SHARED flag. +-------------------+-------------------+ | vma1 | vma2(shmem) | +-------------------+-------------------+ | | addr addr+len alloc_pages_vma() uses get_vma_policy() and mpol_cond_put() pair for maintaining the mempolicy reference count. The current rule is that get_vma_policy() only increments refcount for shmem VMA and mpol_conf_put() only decrements refcount if the policy has MPOL_F_SHARED. In above case, vma1 is not shmem vma and vma->policy has MPOL_F_SHARED! The reference count will be decreased even though was not increased whenever alloc_page_vma() is called. This has been broken since commit [52cd3b07: mempolicy: rework mempolicy Reference Counting] in 2008. There is another serious bug with the sharing of memory policies. Currently, mempolicy rebind logic (it is called from cpuset rebinding) ignores a refcount of mempolicy and override it forcibly. Thus, any mempolicy sharing may cause mempolicy corruption. The bug was introduced by commit [68860ec1: cpusets: automatic numa mempolicy rebinding]. Ideally, the shared policy handling would be rewritten to either properly handle COW of the policy structures or at least reference count MPOL_F_SHARED based exclusively on information within the policy. However, this patch takes the easier approach of disabling any policy sharing between VMAs. Each new range allocated with sp_alloc will allocate a new policy, set the reference count to 1 and drop the reference count of the old policy. This increases the memory footprint but is not expected to be a major problem as mbind() is unlikely to be used for fine-grained ranges. It is also inefficient because it means we allocate a new policy even in cases where mbind_range() could use the new_policy passed to it. However, it is more straight-forward and the change should be invisible to the user. [[email protected]: Edited changelog] Reported-by: Dave Jones <[email protected]> Cc: Christoph Lameter <[email protected]> Reviewed-by: Christoph Lameter <[email protected]> Signed-off-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Mel Gorman <[email protected]> Cc: Josh Boyer <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mempolicy: fix a race in shared_policy_replace() commit b22d127a39ddd10d93deee3d96e643657ad53a49 upstream. shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot be dereferenced if sp->lock is not held and 2) another thread can modify sp_node between spin_unlock for allocating a new sp node and next spin_lock. The bug was introduced before 2.6.12-rc2. Kosaki's original patch for this problem was to allocate an sp node and policy within shared_policy_replace and initialise it when the lock is reacquired. I was not keen on this approach because it partially duplicates sp_alloc(). As the paths were sp->lock is taken are not that performance critical this patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc(). [[email protected]: Original patch] Signed-off-by: Mel Gorman <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Reviewed-by: Christoph Lameter <[email protected]> Cc: Josh Boyer <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mempolicy: fix refcount leak in mpol_set_shared_policy() commit 63f74ca21f1fad36d075e063f06dcc6d39fe86b2 upstream. When shared_policy_replace() fails to allocate new->policy is not freed correctly by mpol_set_shared_policy(). The problem is that shared mempolicy code directly call kmem_cache_free() in multiple places where it is easy to make a mistake. This patch creates an sp_free wrapper function and uses it. The bug was introduced pre-git age (IOW, before 2.6.12-rc2). [[email protected]: Editted changelog] Signed-off-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Mel Gorman <[email protected]> Reviewed-by: Christoph Lameter <[email protected]> Cc: Josh Boyer <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream. Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier related damage v3") introduced a potential memory corruption. shmem_alloc_page() uses a pseudo vma and it has one significant unique combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED. get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED. Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the reference c…

This is a squashed commit. lockd: use rpc client's cl_nodename for id encoding commit 303a7ce92064c285a04c870f2dc0192fdb2968cb upstream. Taking hostname from uts namespace if not safe, because this cuold be performind during umount operation on child reaper death. And in this case current->nsproxy is NULL already. Signed-off-by: Stanislav Kinsbursky <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ACPI: EC: Make the GPE storm threshold a module parameter commit a520d52e99b14ba7db135e916348f12f2a6e09be upstream. The Linux EC driver includes a mechanism to detect GPE storms, and switch from interrupt-mode to polling mode. However, polling mode sometimes doesn't work, so the workaround is problematic. Also, different systems seem to need the threshold for detecting the GPE storm at different levels. ACPI_EC_STORM_THRESHOLD was initially 20 when it's created, and was changed to 8 in 2.6.28 commit 06cf7d3 "ACPI: EC: lower interrupt storm threshold" to fix kernel bug 11892 by forcing the laptop in that bug to work in polling mode. However in bug 45151, it works fine in interrupt mode if we lift the threshold back to 20. This patch makes the threshold a module parameter so that user has a flexible option to debug/workaround this issue. The default is unchanged. This is also a preparation patch to fix specific systems: https://bugzilla.kernel.org/show_bug.cgi?id=45151 Signed-off-by: Feng Tang <[email protected]> Signed-off-by: Len Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ACPI: EC: Add a quirk for CLEVO M720T/M730T laptop commit 67bfa9b60bd689601554526d144b21d529f78a09 upstream. By enlarging the GPE storm threshold back to 20, that laptop's EC works fine with interrupt mode instead of polling mode. https://bugzilla.kernel.org/show_bug.cgi?id=45151 Reported-and-Tested-by: Francesco <[email protected]> Signed-off-by: Feng Tang <[email protected]> Signed-off-by: Len Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> mips,kgdb: fix recursive page fault with CONFIG_KPROBES commit f0a996eeeda214f4293e234df33b29bec003b536 upstream. This fault was detected using the kgdb test suite on boot and it crashes recursively due to the fact that CONFIG_KPROBES on mips adds an extra die notifier in the page fault handler. The crash signature looks like this: kgdbts:RUN bad memory access test KGDB: re-enter exception: ALL breakpoints killed Call Trace: [<807b7548>] dump_stack+0x20/0x54 [<807b7548>] dump_stack+0x20/0x54 The fix for now is to have kgdb return immediately if the fault type is DIE_PAGE_FAULT and allow the kprobe code to decide what is supposed to happen. Signed-off-by: Jason Wessel <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking commit 35c2a7f4908d404c9124c2efc6ada4640ca4d5d5 upstream. Fuzzing with trinity oopsed on the 1st instruction of shmem_fh_to_dentry(), u64 inum = fid->raw[2]; which is unhelpfully reported as at the end of shmem_alloc_inode(): BUG: unable to handle kernel paging request at ffff880061cd3000 IP: [<ffffffff812190d0>] shmem_alloc_inode+0x40/0x40 Oops: 0000 [Evervolv#1] PREEMPT SMP DEBUG_PAGEALLOC Call Trace: [<ffffffff81488649>] ? exportfs_decode_fh+0x79/0x2d0 [<ffffffff812d77c3>] do_handle_open+0x163/0x2c0 [<ffffffff812d792c>] sys_open_by_handle_at+0xc/0x10 [<ffffffff83a5f3f8>] tracesys+0xe1/0xe6 Right, tmpfs is being stupid to access fid->raw[2] before validating that fh_len includes it: the buffer kmalloc'ed by do_sys_name_to_handle() may fall at the end of a page, and the next page not be present. But some other filesystems (ceph, gfs2, isofs, reiserfs, xfs) are being careless about fh_len too, in fh_to_dentry() and/or fh_to_parent(), and could oops in the same way: add the missing fh_len checks to those. Reported-by: Sasha Levin <[email protected]> Signed-off-by: Hugh Dickins <[email protected]> Cc: Al Viro <[email protected]> Cc: Sage Weil <[email protected]> Cc: Steven Whitehouse <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: 7541/1: Add ARM ERRATA 775420 workaround commit 7253b85cc62d6ff84143d96fe6cd54f73736f4d7 upstream. arm: Add ARM ERRATA 775420 workaround Workaround for the 775420 Cortex-A9 (r2p2, r2p6,r2p8,r2p10,r3p0) erratum. In case a date cache maintenance operation aborts with MMU exception, it might cause the processor to deadlock. This workaround puts DSB before executing ISB if an abort may occur on cache maintenance. Based on work by Kouei Abe and feedback from Catalin Marinas. Signed-off-by: Kouei Abe <[email protected]> [ [email protected]: Changed to implementation suggested by [email protected] ] Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> firewire: cdev: fix user memory corruption (i386 userland on amd64 kernel) commit 790198f74c9d1b46b6a89504361b1a844670d050 upstream. Fix two bugs of the /dev/fw* character device concerning the FW_CDEV_IOC_GET_INFO ioctl with nonzero fw_cdev_get_info.bus_reset. (Practically all /dev/fw* clients issue this ioctl right after opening the device.) Both bugs are caused by sizeof(struct fw_cdev_event_bus_reset) being 36 without natural alignment and 40 with natural alignment. 1) Memory corruption, affecting i386 userland on amd64 kernel: Userland reserves a 36 bytes large buffer, kernel writes 40 bytes. This has been first found and reported against libraw1394 if compiled with gcc 4.7 which happens to order libraw1394's stack such that the bug became visible as data corruption. 2) Information leak, affecting all kernel architectures except i386: 4 bytes of random kernel stack data were leaked to userspace. Hence limit the respective copy_to_user() to the 32-bit aligned size of struct fw_cdev_event_bus_reset. Reported-by: Simon Kirby <[email protected]> Signed-off-by: Stefan Richter <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT commit a519fc7a70d1a918574bb826cc6905b87b482eb9 upstream. Instead of doing a shutdown() call, we need to do an actual close(). Ditto if/when the server is sending us junk RPC headers. Signed-off-by: Trond Myklebust <[email protected]> Tested-by: Simon Kirby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xen/bootup: allow {read|write}_cr8 pvops call. commit 1a7bbda5b1ab0e02622761305a32dc38735b90b2 upstream. We actually do not do anything about it. Just return a default value of zero and if the kernel tries to write anything but 0 we BUG_ON. This fixes the case when an user tries to suspend the machine and it blows up in save_processor_state b/c 'read_cr8' is set to NULL and we get: kernel BUG at /home/konrad/ssd/linux/arch/x86/include/asm/paravirt.h:100! invalid opcode: 0000 [Evervolv#1] SMP Pid: 2687, comm: init.late Tainted: G O 3.6.0upstream-00002-gac264ac-dirty Evervolv#4 Bochs Bochs RIP: e030:[<ffffffff814d5f42>] [<ffffffff814d5f42>] save_processor_state+0x212/0x270 .. snip.. Call Trace: [<ffffffff810733bf>] do_suspend_lowlevel+0xf/0xac [<ffffffff8107330c>] ? x86_acpi_suspend_lowlevel+0x10c/0x150 [<ffffffff81342ee2>] acpi_suspend_enter+0x57/0xd5 Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> xen/bootup: allow read_tscp call for Xen PV guests. commit cd0608e71e9757f4dae35bcfb4e88f4d1a03a8ab upstream. The hypervisor will trap it. However without this patch, we would crash as the .read_tscp is set to NULL. This patch fixes it and sets it to the native_read_tscp call. Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> block: fix request_queue->flags initialization commit 60ea8226cbd5c8301f9a39edc574ddabcb8150e0 upstream. A queue newly allocated with blk_alloc_queue_node() has only QUEUE_FLAG_BYPASS set. For request-based drivers, blk_init_allocated_queue() is called and q->queue_flags is overwritten with QUEUE_FLAG_DEFAULT which doesn't include BYPASS even though the initial bypass is still in effect. In blk_init_allocated_queue(), or QUEUE_FLAG_DEFAULT to q->queue_flags instead of overwriting. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Vivek Goyal <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> autofs4 - fix reset pending flag on mount fail commit 49999ab27eab6289a8e4f450e148bdab521361b2 upstream. In autofs4_d_automount(), if a mount fail occurs the AUTOFS_INF_PENDING mount pending flag is not cleared. One effect of this is when using the "browse" option, directory entry attributes show up with all "?"s due to the incorrect callback and subsequent failure return (when in fact no callback should be made). Signed-off-by: Ian Kent <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> module: taint kernel when lve module is loaded commit c99af3752bb52ba3aece5315279a57a477edfaf1 upstream. Cloudlinux have a product called lve that includes a kernel module. This was previously GPLed but is now under a proprietary license, but the module continues to declare MODULE_LICENSE("GPL") and makes use of some EXPORT_SYMBOL_GPL symbols. Forcibly taint it in order to avoid this. Signed-off-by: Matthew Garrett <[email protected]> Cc: Alex Lyashkov <[email protected]> Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> video/udlfb: fix line counting in fb_write commit b8c4321f3d194469007f5f5f2b34ec278c264a04 upstream. Line 0 and 1 were both written to line 0 (on the display) and all subsequent lines had an offset of -1. The result was that the last line on the display was never overwritten by writes to /dev/fbN. Signed-off-by: Alexander Holler <[email protected]> Acked-by: Bernie Thompson <[email protected]> Signed-off-by: Florian Tobias Schandinat <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> viafb: don't touch clock state on OLPC XO-1.5 commit 012a1211845eab69a5488d59eb87d24cc518c627 upstream. As detailed in the thread titled "viafb PLL/clock tweaking causes XO-1.5 instability," enabling or disabling the IGA1/IGA2 clocks causes occasional stability problems during suspend/resume cycles on this platform. This is rather odd, as the documentation suggests that clocks have two states (on/off) and the default (stable) configuration is configured to enable the clock only when it is needed. However, explicitly enabling *or* disabling the clock triggers this system instability, suggesting that there is a 3rd state at play here. Leaving the clock enable/disable registers alone solves this problem. This fixes spurious reboots during suspend/resume behaviour introduced by commit b692a63. Signed-off-by: Daniel Drake <[email protected]> Signed-off-by: Florian Tobias Schandinat <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> timers: Fix endless looping between cascade() and internal_add_timer() commit 26cff4e2aa4d666dc6a120ea34336b5057e3e187 upstream. Adding two (or more) timers with large values for "expires" (they have to reside within tv5 in the same list) leads to endless looping between cascade() and internal_add_timer() in case CONFIG_BASE_SMALL is one and jiffies are crossing the value 1 << 18. The bug was introduced between 2.6.11 and 2.6.12 (and survived for quite some time). This patch ensures that when cascade() is called timers within tv5 are not added endlessly to their own list again, instead they are added to the next lower tv level tv4 (as expected). Signed-off-by: Christian Hildner <[email protected]> Reviewed-by: Jan Kiszka <[email protected]> Link: http://lkml.kernel.org/r/98673C87CB31274881CFFE0B65ECC87B0F5FC1963E@DEFTHW99EA4MSX.ww902.siemens.net Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> pktgen: fix crash when generating IPv6 packets commit 5aa8b572007c4bca1e6d3dd4c4820f1ae49d6bb2 upstream. For IPv6, sizeof(struct ipv6hdr) = 40, thus the following expression will result negative: datalen = pkt_dev->cur_pkt_size - 14 - sizeof(struct ipv6hdr) - sizeof(struct udphdr) - pkt_dev->pkt_overhead; And, the check "if (datalen < sizeof(struct pktgen_hdr))" will be passed as "datalen" is promoted to unsigned, therefore will cause a crash later. This is a quick fix by checking if "datalen" is negative. The following patch will increase the default value of 'min_pkt_size' for IPv6. This bug should exist for a long time, so Cc -stable too. Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tg3: Apply short DMA frag workaround to 5906 commit b7abee6ef888117f92db370620ebf116a38e3f4d upstream. 5906 devices also need the short DMA fragment workaround. This patch makes the necessary change. Signed-off-by: Matt Carlson <[email protected]> Tested-by: Christian Kujau <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Mike Pagano <[email protected]> ipvs: fix oops in ip_vs_dst_event on rmmod commit 283283c4da91adc44b03519f434ee1e7e91d6fdb upstream. After commit 39f618b4fd95ae243d940ec64c961009c74e3333 (3.4) "ipvs: reset ipvs pointer in netns" we can oops in ip_vs_dst_event on rmmod ip_vs because ip_vs_control_cleanup is called after the ipvs_core_ops subsys is unregistered and net->ipvs is NULL. Fix it by exiting early from ip_vs_dst_event if ipvs is NULL. It is safe because all services and dests for the net are already freed. Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netfilter: nf_conntrack: fix racy timer handling with reliable events commit 5b423f6a40a0327f9d40bc8b97ce9be266f74368 upstream. Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netfilter: nf_ct_ipv4: packets with wrong ihl are invalid commit 07153c6ec074257ade76a461429b567cff2b3a1e upstream. It was reported that the Linux kernel sometimes logs: klogd: [2629147.402413] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 447! klogd: [1072212.887368] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 392 ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in nf_conntrack_proto_tcp.c should catch malformed packets, so the errors at the indicated lines - TCP options parsing - should not happen. However, tcp_error() relies on the "dataoff" offset to the TCP header, calculated by ipv4_get_l4proto(). But ipv4_get_l4proto() does not check bogus ihl values in IPv4 packets, which then can slip through tcp_error() and get caught at the TCP options parsing routines. The patch fixes ipv4_get_l4proto() by invalidating packets with bogus ihl value. The patch closes netfilter bugzilla id 771. Signed-off-by: Jozsef Kadlecsik <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation commit 3f509c689a07a4aa989b426893d8491a7ffcc410 upstream. We're hitting bug while trying to reinsert an already existing expectation: kernel BUG at kernel/timer.c:895! invalid opcode: 0000 [Evervolv#1] SMP [...] Call Trace: <IRQ> [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack] [<ffffffff812d423a>] ? in4_pton+0x72/0x131 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip] [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip] [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip] [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip] We have to remove the RTP expectation if the RTCP expectation hits EBUSY since we keep trying with other ports until we succeed. Reported-by: Rafal Fitt <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ipvs: fix oops on NAT reply in br_nf context commit 9e33ce453f8ac8452649802bee1f410319408f4b upstream. IPVS should not reset skb->nf_bridge in FORWARD hook by calling nf_reset for NAT replies. It triggers oops in br_nf_forward_finish. [ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112 [ 579.781792] PGD 218f9067 PUD 0 [ 579.781865] Oops: 0000 [Evervolv#1] SMP [ 579.781945] CPU 0 [ 579.781983] Modules linked in: [ 579.782047] [ 579.782080] [ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8 [ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112 [ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287 [ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a [ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00 [ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90 [ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02 [ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000 [ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70 [ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b [ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0 [ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760) [ 579.783919] Stack: [ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00 [ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7 [ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0 [ 579.784477] Call Trace: [ 579.784523] <IRQ> [ 579.784562] [ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8 [ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d [ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae [ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102 [ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae [ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b [ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2 [ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257 [ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb [ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1 [ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44 [ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102 [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44 [ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54 [ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229 [ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257 [ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1 [ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2 [ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242 [ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0 [ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c [ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30 The steps to reproduce as follow, 1. On Host1, setup brige br0(192.168.1.106) 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd 3. Start IPVS service on Host1 ipvsadm -A -t 192.168.1.106:80 -s rr ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m 4. Run apache benchmark on Host2(192.168.1.101) ab -n 1000 http://192.168.1.106/ ip_vs_reply4 ip_vs_out handle_response ip_vs_notrack nf_reset() { skb->nf_bridge = NULL; } Actually, IPVS wants in this case just to replace nfct with untracked version. So replace the nf_reset(skb) call in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call. Signed-off-by: Lin Ming <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netfilter: nf_nat_sip: fix via header translation with multiple parameters commit f22eb25cf5b1157b29ef88c793b71972efc47143 upstream. Via-headers are parsed beginning at the first character after the Via-address. When the address is translated first and its length decreases, the offset to start parsing at is incorrect and header parameters might be missed. Update the offset after translating the Via-address to fix this. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netfilter: nf_ct_expect: fix possible access to uninitialized timer commit 2614f86490122bf51eb7c12ec73927f1900f4e7d upstream. In __nf_ct_expect_check, the function refresh_timer returns 1 if a matching expectation is found and its timer is successfully refreshed. This results in nf_ct_expect_related returning 0. Note that at this point: - the passed expectation is not inserted in the expectation table and its timer was not initialized, since we have refreshed one matching/existing expectation. - nf_ct_expect_alloc uses kmem_cache_alloc, so the expectation timer is in some undefined state just after the allocation, until it is appropriately initialized. This can be a problem for the SIP helper during the expectation addition: ... if (nf_ct_expect_related(rtp_exp) == 0) { if (nf_ct_expect_related(rtcp_exp) != 0) nf_ct_unexpect_related(rtp_exp); ... Note that nf_ct_expect_related(rtp_exp) may return 0 for the timer refresh case that is detailed above. Then, if nf_ct_unexpect_related(rtcp_exp) returns != 0, nf_ct_unexpect_related(rtp_exp) is called, which does: spin_lock_bh(&nf_conntrack_lock); if (del_timer(&exp->timeout)) { nf_ct_unlink_expect(exp); nf_ct_expect_put(exp); } spin_unlock_bh(&nf_conntrack_lock); Note that del_timer always returns false if the timer has been initialized. However, the timer was not initialized since setup_timer was not called, therefore, the expectation timer remains in some undefined state. If I'm not missing anything, this may lead to the removal an unexistent expectation. To fix this, the optimization that allows refreshing an expectation is removed. Now nf_conntrack_expect_related looks more consistent to me since it always add the expectation in case that it returns success. Thanks to Patrick McHardy for participating in the discussion of this patch. I think this may be the source of the problem described by: http://marc.info/?l=netfilter-devel&m=134073514719421&w=2 Reported-by: Rafal Fitt <[email protected]> Acked-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Conflicts: net/netfilter/nf_conntrack_expect.c netfilter: limit, hashlimit: avoid duplicated inline commit 7a909ac70f6b0823d9f23a43f19598d4b57ac901 upstream. credit_cap can be set to credit, which avoids inlining user2credits twice. Also, remove inline keyword and let compiler decide. old: 684 192 0 876 36c net/netfilter/xt_limit.o 4927 344 32 5303 14b7 net/netfilter/xt_hashlimit.o now: 668 192 0 860 35c net/netfilter/xt_limit.o 4793 344 32 5169 1431 net/netfilter/xt_hashlimit.o Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> netfilter: xt_limit: have r->cost != 0 case work commit 82e6bfe2fbc4d48852114c4f979137cd5bf1d1a8 upstream. Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when a running state is saved to userspace and then reinstated from there. Make sure that private xt_limit area is initialized with correct values. Otherwise, random matchings due to use of uninitialized memory. Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Acked-by: David Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Add CDC-ACM support for the CX93010-2x UCMxx USB Modem commit e7d491a19d3e3aac544070293891a2542ae0c565 upstream. This USB V.92/V.32bis Controllered Modem have the USB vendor ID 0x0572 and device ID 0x1340. It need the NO_UNION_NORMAL quirk to be recognized. Reference: http://www.conexant.com/servlets/DownloadServlet/DSH-201723-005.pdf?docid=1725&revid=5 See idVendor and idProduct in table 6-1. Device Descriptors Signed-off-by: Jean-Christian de Rivaz <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drm/radeon: Don't destroy I2C Bus Rec in radeon_ext_tmds_enc_destroy(). commit 082918471139b07964967cfe5f70230909c82ae1 upstream. radeon_i2c_fini() walks thru the list of I2C bus recs rdev->i2c_bus[] to destroy each of them. radeon_ext_tmds_enc_destroy() however also has code to destroy it's associated I2C bus rec which has been obtained by radeon_i2c_lookup() and is therefore also in the i2c_bus[] list. This causes a double free resulting in a kernel panic when unloading the radeon driver. Removing destroy code from radeon_ext_tmds_enc_destroy() fixes this problem. agd5f: fix compiler warning Signed-off-by: Egbert Eich <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> jbd: Fix assertion failure in commit code due to lacking transaction credits commit 09e05d4805e6c524c1af74e524e5d0528bb3fef3 upstream. ext3 users of data=journal mode with blocksize < pagesize were occasionally hitting assertion failure in journal_commit_transaction() checking whether the transaction has at least as many credits reserved as buffers attached. The core of the problem is that when a file gets truncated, buffers that still need checkpointing or that are attached to the committing transaction are left with buffer_mapped set. When this happens to buffers beyond i_size attached to a page stradding i_size, subsequent write extending the file will see these buffers and as they are mapped (but underlying blocks were freed) things go awry from here. The assertion failure just coincidentally (and in this case luckily as we would start corrupting filesystem) triggers due to journal_head not being properly cleaned up as well. Under some rare circumstances this bug could even hit data=ordered mode users. There the assertion won't trigger and we would end up corrupting the filesystem. We fix the problem by unmapping buffers if possible (in lots of cases we just need a buffer attached to a transaction as a place holder but it must not be written out anyway). And in one case, we just have to bite the bullet and wait for transaction commit to finish. Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> x86, random: Architectural inlines to get random integers with RDRAND commit 628c6246d47b85f5357298601df2444d7f4dd3fd upstream. Architectural inlines to get random ints and longs using the RDRAND instruction. Intel has introduced a new RDRAND instruction, a Digital Random Number Generator (DRNG), which is functionally an high bandwidth entropy source, cryptographic whitener, and integrity monitor all built into hardware. This enables RDRAND to be used directly, bypassing the kernel random number pool. For technical documentation, see: http://software.intel.com/en-us/articles/download-the-latest-bull-mountain-software-implementation-guide/ In this patch, this is *only* used for the nonblocking random number pool. RDRAND is a nonblocking source, similar to our /dev/urandom, and is therefore not a direct replacement for /dev/random. The architectural hooks presented in the previous patch only feed the kernel internal users, which only use the nonblocking pool, and so this is not a problem. Since this instruction is available in userspace, there is no reason to have a /dev/hw_rng device driver for the purpose of feeding rngd. This is especially so since RDRAND is a nonblocking source, and needs additional whitening and reduction (see the above technical documentation for details) in order to be of "pure entropy source" quality. The CONFIG_EXPERT compile-time option can be used to disable this use of RDRAND. Signed-off-by: H. Peter Anvin <[email protected]> Originally-by: Fenghua Yu <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Herbert Xu <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> x86, random: Verify RDRAND functionality and allow it to be disabled commit 49d859d78c5aeb998b6936fcb5f288f78d713489 upstream. If the CPU declares that RDRAND is available, go through a guranteed reseed sequence, and make sure that it is actually working (producing data.) If it does not, disable the CPU feature flag. Allow RDRAND to be disabled on the command line (as opposed to at compile time) for a user who has special requirements with regards to random numbers. Signed-off-by: H. Peter Anvin <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Herbert Xu <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> tpm: Propagate error from tpm_transmit to fix a timeout hang commit abce9ac292e13da367bbd22c1f7669f988d931ac upstream. tpm_write calls tpm_transmit without checking the return value and assigns the return value unconditionally to chip->pending_data, even if it's an error value. This causes three bugs. So if we write to /dev/tpm0 with a tpm_param_size bigger than TPM_BUFSIZE=0x1000 (e.g. 0x100a) and a bufsize also bigger than TPM_BUFSIZE (e.g. 0x100a) tpm_transmit returns -E2BIG which is assigned to chip->pending_data as -7, but tpm_write returns that TPM_BUFSIZE bytes have been successfully been written to the TPM, altough this is not true (bug Evervolv#1). As we did write more than than TPM_BUFSIZE bytes but tpm_write reports that only TPM_BUFSIZE bytes have been written the vfs tries to write the remaining bytes (in this case 10 bytes) to the tpm device driver via tpm_write which then blocks at /* cannot perform a write until the read has cleared either via tpm_read or a user_read_timer timeout */ while (atomic_read(&chip->data_pending) != 0) msleep(TPM_TIMEOUT); for 60 seconds, since data_pending is -7 and nobody is able to read it (since tpm_read luckily checks if data_pending is greater than 0) (#bug 2). After that the remaining bytes are written to the TPM which are interpreted by the tpm as a normal command. (bug Evervolv#3) So if the last bytes of the command stream happen to be a e.g. tpm_force_clear this gets accidentally sent to the TPM. This patch fixes all three bugs, by propagating the error code of tpm_write and returning -E2BIG if the input buffer is too big, since the response from the tpm for a truncated value is bogus anyway. Moreover it returns -EBUSY to userspace if there is a response ready to be read. Signed-off-by: Peter Huewe <[email protected]> Signed-off-by: Kent Yoder <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> udf: fix retun value on error path in udf_load_logicalvol commit 68766a2edcd5cd744262a70a2f67a320ac944760 upstream. In case we detect a problem and bail out, we fail to set "ret" to a nonzero value, and udf_load_logicalvol will mistakenly report success. Signed-off-by: Nikola Pajkovsky <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: ac97 - Fix missing NULL check in snd_ac97_cvol_new() commit 733a48e5ae5bf28b046fad984d458c747cbb8c21 upstream. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=44721 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ALSA: emu10k1: add chip details for E-mu 1010 PCIe card commit 10f571d09106c3eb85951896522c9650596eff2e upstream. Add chip details for E-mu 1010 PCIe card. It has the same chip as found in E-mu 1010b but it uses different PCI id. Signed-off-by: Maxim Kachur <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> ARM: vfp: fix saving d16-d31 vfp registers on v6+ kernels commit 846a136881b8f73c1f74250bf6acfaa309cab1f2 upstream. Michael Olbrich reported that his test program fails when built with -O2 -mcpu=cortex-a8 -mfpu=neon, and a kernel which supports v6 and v7 CPUs: volatile int x = 2; volatile int64_t y = 2; int main() { volatile int a = 0; volatile int64_t b = 0; while (1) { a = (a + x) % (1 << 30); b = (b + y) % (1 << 30); assert(a == b); } } and two instances are run. When built for just v7 CPUs, this program works fine. It uses the "vadd.i64 d19, d18, d16" VFP instruction. It appears that we do not save the high-16 double VFP registers across context switches when the kernel is built for v6 CPUs. Fix that. Tested-By: Michael Olbrich <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. [On Android, this can only be set by root/CAP_MAC_ADMIN processes, and if running SELinux in enforcing mode, only if mac_admin permission is granted in policy. In Android 4.4, this would only be allowed for root/CAP_MAC_ADMIN processes that are also in unconfined domains. In current AOSP master, mac_admin is not allowed for any domains except the recovery console which has a legitimate need for it. The other potential vector is mounting a maliciously crafted filesystem for which SELinux fetches xattrs (e.g. an ext4 filesystem on a SDcard). However, the end result is only a local denial-of-service (DOS) due to kernel BUG. This fix is queued for 3.14.] Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [Evervolv#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec Evervolv#1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a [ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b [ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179 [ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4 [ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31 [ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22 [ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d [ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91 [ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b [ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30 [ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP <ffff8805c0ac3c38> [ 475.328734] ---[ end trace f076482e9d754adc ]--- [sds: commit message edited to note Android implications and to generate a unique Change-Id for gerrit] Change-Id: I4d5389f0cfa72b5f59dada45081fa47e03805413 Reported-by: Matthew Thode <[email protected]> Signed-off-by: Stephen Smalley <[email protected]> Cc: [email protected] Signed-off-by: Paul Moore <[email protected]>

Below Kernel panic is observed due to race condition, where sock_has_perm called in a thread and is trying to access sksec->sid without checking sksec. Just before that, sk->sk_security was set to NULL by selinux_sk_free_security through sk_free in other thread. 31704.949269: <3> IPv4: Attempt to release alive inet socket dd81b200 31704.959049: <1> Unable to handle kernel NULL pointer dereference at \ virtual address 00000000 31704.983562: <1> pgd = c6b74000 31704.985248: <1> [00000000] *pgd=00000000 31704.996591: <0> Internal error: Oops: 5 [Evervolv#1] PREEMPT SMP ARM 31705.001016: <6> Modules linked in: adsprpc [last unloaded: wlan] 31705.006659: <6> CPU: 1 Tainted: G O \ (3.4.0-g837ab9b-00003-g6bcd9c6 Evervolv#1) 31705.014042: <6> PC is at sock_has_perm+0x58/0xd4 31705.018292: <6> LR is at sock_has_perm+0x58/0xd4 31705.022546: <6> pc : [<c0341e8c>] lr : [<c0341e8c>] \ psr: 60000013 31705.022549: <6> sp : dda27f00 ip : 00000000 fp : 5f36fc84 31705.034002: <6> r10: 00004000 r9 : 0000009d r8 : e8c2b700 31705.039211: <6> r7 : dda27f24 r6 : dd81b200 r5 : 00000000 \ r4 : 00000000 31705.045721: <6> r3 : 00000000 r2 : dda27ef8 r1 : 00000000 \ r0 : dda27f54 31705.052232: <6> Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM \ Segment user 31705.059349: <6> Control: 10c5787d Table: 10d7406a DAC: 00000015 . . . . 31705.697816: <6> [<c0341e8c>] (sock_has_perm+0x58/0xd4) from \ [<c033ed10>] (security_socket_getsockopt+0x14/0x1c) 31705.707534: <6> [<c033ed10>] (security_socket_getsockopt+0x14/0x1c) \ from [<c0745c18>] (sys_getsockopt+0x34/0xa8) 31705.717343: <6> [<c0745c18>] (sys_getsockopt+0x34/0xa8) from \ [<c0106140>] (ret_fast_syscall+0x0/0x30) 31705.726193: <0> Code: e59832e8 e5933058 e5939004 ebfac736 (e5953000) 31705.732635: <4> ---[ end trace 22889004dafd87bd ]--- Change-Id: I79c3fb525f35ea2494d53788788cd71a38a32d6b Signed-off-by: Satya Durga Srinivasu Prabhala <[email protected]> Signed-off-by: Osvaldo Banuelos <[email protected]>

Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector is turned on, this can cause a panic due to stack corruption. The problem is that xbuf[] is not big enough to hold a 64-bit timeout rendered as weeks: (gdb) p 0xffffffffffffffffULL/(60*60*24*7) $2 = 30500568904943 That's 14 chars plus NUL, not 11 chars plus NUL. Expand the buffer to 16 chars. I think the unpatched code apparently works if the stack-protector is not enabled because on a 32-bit machine the buffer won't be overflowed and on a 64-bit machine there's a 64-bit aligned pointer at one side and an int that isn't checked again on the other side. The panic incurred looks something like: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 Evervolv#1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679 Call Trace: [<ffffffff813d941f>] dump_stack+0x63/0x84 [<ffffffff811b2cb6>] panic+0xde/0x22a [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0 [<ffffffff81350410>] ? key_validate+0x50/0x50 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20 [<ffffffff8126b31c>] seq_read+0x2cc/0x390 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70 [<ffffffff81244fc7>] __vfs_read+0x37/0x150 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0 [<ffffffff81246156>] vfs_read+0x96/0x130 [<ffffffff81247635>] SyS_read+0x55/0xc0 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4 Change-Id: I0787d5a38c730ecb75d3c08f28f0ab36295d59e7 Reported-by: Ondrej Kozina <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Ondrej Kozina <[email protected]> Signed-off-by: spezi77 <[email protected]>

Merging with current kernel changes

86f6585

zachf714 pushed a commit to zachf714/android_kernel_htc_qsd8k that referenced this pull request Jan 20, 2014

Merge pull request Evervolv#1 from croniccorey/3.0-wip

335e2b8

3.0 wip

zachf714 pushed a commit to zachf714/android_kernel_htc_qsd8k that referenced this pull request Jan 20, 2014

bravo: rtc fixes Evervolv#1

61f7921

zachf714 pushed a commit to zachf714/android_kernel_htc_qsd8k that referenced this pull request Jan 20, 2014

bravo: fixed freezes Evervolv#1

e8096c9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merging with current kernel changes #1

Merging with current kernel changes #1

lehmann commented Oct 25, 2013

Merging with current kernel changes #1

Are you sure you want to change the base?

Merging with current kernel changes #1

Conversation

lehmann commented Oct 25, 2013