summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2014-05-05net: Generalize checksum_init functionsTom Herbert
Create a general __skb_checksum_validate function (actually a macro) to subsume the various checksum_init functions. This function can either init the checksum, or do the full validation (logically checksum_init+skb_check_complete)-- a flag specifies if full vaidation is performed. Also, there is a flag to the function to indicate that zero checksums are allowed (to support optional UDP checksums). Added several stub functions for calling __skb_checksum_validate. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-04net: filter: make register naming more comprehensibleDaniel Borkmann
The current code is a bit hard to parse on which registers can be used, how they are mapped and all play together. It makes much more sense to define this a bit more clearly so that the code is a bit more intuitive. This patch cleans this up, and makes naming a bit more consistent among the code. This also allows for moving some of the defines into the header file. Clearing of A and X registers in __sk_run_filter() do not get a particular register name assigned as they have not an 'official' function, but rather just result from the concrete initial mapping of old BPF programs. Since for BPF helper functions for BPF_CALL we already use small letters, so be consistent here as well. No functional changes. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-04net: filter: simplify label names from jump-tableDaniel Borkmann
This patch simplifies label naming for the BPF jump-table. When we define labels via DL(), we just concatenate/textify the combination of instruction opcode which consists of the class, subclass, word size, target register and so on. Each time we leave BPF_ prefix intact, so that e.g. the preprocessor generates a label BPF_ALU_BPF_ADD_BPF_X for DL(BPF_ALU, BPF_ADD, BPF_X) whereas a label name of ALU_ADD_X is much more easy to grasp. Pure cleanup only. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-02tcp: fix cwnd limited checking to improve congestion controlEric Dumazet
Yuchung discovered tcp_is_cwnd_limited() was returning false in slow start phase even if the application filled the socket write queue. All congestion modules take into account tcp_is_cwnd_limited() before increasing cwnd, so this behavior limits slow start from probing the bandwidth at full speed. The problem is that even if write queue is full (aka we are _not_ application limited), cwnd can be under utilized if TSO should auto defer or TCP Small queues decided to hold packets. So the in_flight can be kept to smaller value, and we can get to the point tcp_is_cwnd_limited() returns false. With TCP Small Queues and FQ/pacing, this issue is more visible. We fix this by having tcp_cwnd_validate(), which is supposed to track such things, take into account unsent_segs, the number of segs that we are not sending at the moment due to TSO or TSQ, but intend to send real soon. Then when we are cwnd-limited, remember this fact while we are processing the window of ACKs that comes back. For example, suppose we have a brand new connection with cwnd=10; we are in slow start, and we send a flight of 9 packets. By the time we have received ACKs for all 9 packets we want our cwnd to be 18. We implement this by setting tp->lsnd_pending to 9, and considering ourselves to be cwnd-limited while cwnd is less than twice tp->lsnd_pending (2*9 -> 18). This makes tcp_is_cwnd_limited() more understandable, by removing the GSO/TSO kludge, that tried to work around the issue. Note the in_flight parameter can be removed in a followup cleanup patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-02mdio_bus: implement devm_mdiobus_alloc/devm_mdiobus_freeGrygorii Strashko
Add a resource managed devm_mdiobus_alloc[_size]()/devm_mdiobus_free() to automatically clean up MDIO bus alocations made by MDIO drivers, thus leading to simplified MDIO drivers code. Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-and-tested-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-26at86rf230: use irq_get_trigger_typeAlexander Aring
This patch removes the platform data for the irq_type. We use instead the irq_get_trigger_type function to get these flags which should already configured by the interrupt controller. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/intel/igb/e1000_mac.c net/core/filter.c Both conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22netlink: have netlink per-protocol bind function return an error code.Richard Guy Briggs
Have the netlink per-protocol optional bind function return an int error code rather than void to signal a failure. This will enable netlink protocols to perform extra checks including capabilities and permissions verifications when updating memberships in multicast groups. In netlink_bind() and netlink_setsockopt() the call to the per-protocol bind function was moved above the multicast group update to prevent any access to the multicast socket groups before checking with the per-protocol bind function. This will enable the per-protocol bind function to be used to check permissions which could be denied before making them available, and to avoid the messy job of undoing the addition should the per-protocol bind function fail. The netfilter subsystem seems to be the only one currently using the per-protocol bind function. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22filter: added BPF random opcodeChema Gonzalez
Added a new ancillary load (bpf call in eBPF parlance) that produces a 32-bit random number. We are implementing it as an ancillary load (instead of an ISA opcode) because (a) it is simpler, (b) allows easy JITing, and (c) seems more in line with generic ISAs that do not have "get a random number" as a instruction, but as an OS call. The main use for this ancillary load is to perform random packet sampling. Signed-off-by: Chema Gonzalez <chema@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22ethtool: Support for configurable RSS hash keyVenkata Duvvuru
This ethtool patch primarily copies the ioctl command data structures from/to the User space and invokes the driver hook. Signed-off-by: Venkat Duvvuru <VenkatKumar.Duvvuru@Emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22net: Fix ns_capable check in sock_diag_put_filterinfoAndrew Lutomirski
The caller needs capabilities on the namespace being queried, not on their own namespace. This is a security bug, although it likely has only a minor impact. Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-20net: Add __dev_forward_skbHerbert Xu
This patch adds the helper __dev_forward_skb which is identical to dev_forward_skb except that it doesn't actually inject the skb into the stack. This is useful where we wish to have finer control over how the packet is injected, e.g., via netif_rx_ni or netif_receive_skb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-20net: phy: export genphy_config_init()Daniel Mack
This enables other drivers to call this generic implementation, and then only do specific details on top of it. Signed-off-by: Daniel Mack <zonque@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull more networking fixes from David Miller: 1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI context, not synchronize it. From Chris Mason. 2) Ipv4 flow input interface should never be zero, it should be LOOPBACK_IFINDEX instead. From Cong Wang and Julian Anastasov. 3) Properly configure MAC to PHY connection in mvneta devices, from Thomas Petazzoni. 4) sys_recv should use SYSCALL_DEFINE. From Jan Glauber. 5) Tunnel driver ioctls do not use the correct namespace, fix from Nicolas Dichtel. 6) Fix memory leak on seccomp filter attach, from Kees Cook. 7) Fix lockdep warning for nested vlans, from Ding Tianhong. 8) Crashes can happen in SCTP due to how the auth_enable value is managed, fix from Vlad Yasevich. 9) Wireless fixes from John W Linville and co. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) net: sctp: cache auth_enable per endpoint tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled vlan: Fix lockdep warning when vlan dev handle notification seccomp: fix memory leak on filter attach isdn: icn: buffer overflow in icn_command() ip6_tunnel: use the right netns in ioctl handler sit: use the right netns in ioctl handler ip_tunnel: use the right netns in ioctl handler net: use SYSCALL_DEFINEx for sys_recv net: mdio-gpio: Add support for separate MDI and MDO gpio pins net: mdio-gpio: Add support for active low gpio pins net: mdio-gpio: Use devm_ functions where possible ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source() ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll net: mvneta: properly configure the MAC <-> PHY connection in all situations net: phy: add minimal support for QSGMII PHY sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast) mwifiex: fix hung task on command timeout mwifiex: process event before command response ...
2014-04-18Merge tag 'char-misc-3.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a few driver fixes for char/misc drivers that resolve reported issues. All have been in linux-next successfully for a few days" * tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts Tools: hv: Handle the case when the target file exists correctly vme_tsi148: Utilize to_pci_dev() macro vme_tsi148: Fix PCI address mapping assumption vme_tsi148: Fix typo in tsi148_slave_get() w1: avoid recursive device_add w1: fix netlink refcnt leak on error path misc: Grammar s/addition/additional/ drivers: mcb: fix memory leak in chameleon_parse_cells() error path mei: ignore client writing state during cb completion mei: me: do not load the driver if the FW doesn't support MEI interface GenWQE: Increase driver version number GenWQE: Fix multithreading problems GenWQE: Ensure rc is not returning an uninitialized value GenWQE: Add wmb before DDCB is started GenWQE: Enable access to VPD flash area
2014-04-18Merge tag 'driver-core-3.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some driver core fixes for 3.15-rc2. Also in here are some documentation updates, as well as an API removal that had to wait for after -rc1 due to the cleanups coming into you from multiple developer trees (this one and the PPC tree.) All have been in linux next successfully" * tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: drivers/base/dd.c incorrect pr_debug() parameters Documentation: Update stable address in Chinese and Japanese translations topology: Fix compilation warning when not in SMP Chinese: add translation of io_ordering.txt stable_kernel_rules: spelling/word usage sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner() kernfs: protect lazy kernfs_iattrs allocation with mutex fs: Don't return 0 from get_anon_bdev
2014-04-18Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: thp: close race between split and zap huge pages mm: fix new kernel-doc warning in filemap.c mm: fix CONFIG_DEBUG_VM_RB description mm: use paravirt friendly ops for NUMA hinting ptes mips: export flush_icache_range mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages() wait: explain the shadowing and type inconsistencies Shiraz has moved Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt powerpc/mm: fix ".__node_distance" undefined kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write() init/Kconfig: move the trusted keyring config option to general setup vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
2014-04-18wait: explain the shadowing and type inconsistenciesPeter Zijlstra
Stick in a comment before someone else tries to fix the sparse warning this generates. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-o2ro6f3vkxklni0bc8f7m68s@git.kernel.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18Shiraz has movedViresh Kumar
shiraz.hashim@st.com email-id doesn't exist anymore as he has left the company. Replace ST's id with shiraz.linux.kernel@gmail.com. It also updates .mailmap file to fix address for 'git shortlog'. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma updates from Roland Dreier: - mostly cxgb4 fixes unblocked by the merge of some prerequisites via the net tree - drop deprecated MSI-X API use. - a couple other miscellaneous things. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cxgb4: Fix over-dereference when terminating RDMA/cxgb4: Use uninitialized_var() RDMA/cxgb4: Add missing debug stats RDMA/cxgb4: Initialize reserved fields in a FW work request RDMA/cxgb4: Use pr_warn_ratelimited RDMA/cxgb4: Max fastreg depth depends on DSGL support RDMA/cxgb4: SQ flush fix RDMA/cxgb4: rmb() after reading valid gen bit RDMA/cxgb4: Endpoint timeout fixes RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices IB/mlx5: Add block multicast loopback support IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix() IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix()
2014-04-18Merge tag 'dt-fixes-for-3.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - fix error handling in of_update_property - fix section mismatch warnings in __reserved_mem_check_root - add empty of_find_node_by_path for !OF builds - add various missing binding documentation * tag 'dt-fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: add empty of_find_node_by_path() for !OF of: Clean up of_update_property DT: add vendor prefix for EBV Elektronik of: Fix the section mismatch warnings. of: Add vendor prefix for Digi International Inc. DT: I2C: Add trivial bindings used by kirkwood boards DT: Vendor: Add prefixes used by Kirkwood devices DT: bindings: add missing Marvell Kirkwood SoC documentation dt-bindings: add vendor-prefix for Newhaven Display of: add vendor prefix for I2SE GmbH of: add vendor prefix for ISEE 2007 S.L.
2014-04-18of: add empty of_find_node_by_path() for !OFAlexander Shiyan
Add an empty version of of_find_node_by_path(). This fixes following build error for asoc tree: sound/soc/fsl/fsl_ssi.c: In function 'fsl_ssi_probe': sound/soc/fsl/fsl_ssi.c:1471:2: error: implicit declaration of function 'of_find_node_by_path' [-Werror=implicit-function-declaration] sprop = of_get_property(of_find_node_by_path("/"), "compatible", NULL); Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Rob Herring <robh@kernel.org>
2014-04-17Merge branch 'ipmi' (emailed ipmi fixes)Linus Torvalds
Merge ipmi fixes from Corey Minyard: "Things collected since last kernel release. Some of these are pretty important. The first three are bug fixes. The next two are to hopefully make everyone happy about allowing ACPI to be on all the time and not have IPMI have an effect on the system when not in use. The last is a little cleanup" * emailed patches from Corey Minyard <cminyard@mvista.com>: ipmi: boolify some things ipmi: Turn off all activity on an idle ipmi interface ipmi: Turn off default probing of interfaces ipmi: Reset the KCS timeout when starting error recovery ipmi: Fix a race restarting the timer Char: ipmi_bt_sm, fix infinite loop
2014-04-17ipmi: boolify some thingsCorey Minyard
Convert some ints to bools. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17ipmi: Turn off all activity on an idle ipmi interfaceCorey Minyard
The IPMI driver would wake up periodically looking for events and watchdog pretimeouts. If there is nothing waiting for these events, it's really kind of pointless to be checking for them. So modify the driver so the message handler can pass down if it needs the lower layer to be waiting for these. Modify the system interface lower layer to turn off all timer and thread activity if the upper layer doesn't need anything and it is not currently handling messages. And modify the message handler to not restart the timer if its timer is not needed. The timers and kthread will still be enabled if: - the SI interface is handling a message. - a user has enabled watching for events. - the IPMI watchdog timer is in use (since it uses pretimeouts). - the message handler is waiting on a remote response. - a user has registered to receive commands. This mostly affects interfaces without interrupts. Interfaces with interrupts already don't use CPU in the system interface when the interface is idle. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-16Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Various fixes: - reboot regression fix - build message spam fix - GPU quirk fix - 'make kvmconfig' fix plus the wire-up of the renameat2() system call on i386" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove the PCI reboot method from the default chain x86/build: Supress "Nothing to be done for ..." messages x86/gpu: Fix sign extension issue in Intel graphics stolen memory quirks x86/platform: Fix "make O=dir kvmconfig" i386: Wire up the renameat2() syscall
2014-04-16Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hostsK. Y. Srinivasan
Only ws2012r2 hosts support the ability to reconnect to the host on VMBUS. This functionality is needed by kexec in Linux. To use this functionality we need to negotiate version 3.0 of the VMBUS protocol. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16net: mdio-gpio: Add support for separate MDI and MDO gpio pinsGuenter Roeck
This is for a system with fixed assignments of input and output pins (various variants of Kontron COMe). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16net: mdio-gpio: Add support for active low gpio pinsGuenter Roeck
Some systems using mdio-gpio may use active-low gpio pins (eg with inverters or FETs connected to all or some of the gpio pins). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()Tejun Heo
All device_schedule_callback_owner() users are converted to use device_remove_file_self(). Remove now unused {sysfs|device}_schedule_callback_owner(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16net: phy: add minimal support for QSGMII PHYThomas Petazzoni
This commit adds the necessary definitions for the PHY layer to recognize "qsgmii" as a valid PHY interface. A QSMII interface, as defined at http://en.wikipedia.org/wiki/Media_Independent_Interface#Quad_Serial_Gigabit_Media_Independent_Interface, is "is a method of combining four SGMII lines into a 5Gbit/s interface. QSGMII, like SGMII, uses LVDS signalling for the TX and RX data and a single LVDS clock signal. QSGMII uses significantly fewer signal lines than four SGMII busses." This type of MAC <-> PHY connection might require special handling on the MAC driver side, so it should be possible to express this type of MAC <-> PHY connection, for example in the Device Tree. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: devicetree@vger.kernel.org Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16x86: Remove the PCI reboot method from the default chainIngo Molnar
Steve reported a reboot hang and bisected it back to this commit: a4f1987e4c54 x86, reboot: Add EFI and CF9 reboot methods into the default list He heroically tested all reboot methods and found the following: reboot=t # triple fault ok reboot=k # keyboard ctrl FAIL reboot=b # BIOS ok reboot=a # ACPI FAIL reboot=e # EFI FAIL [system has no EFI] reboot=p # PCI 0xcf9 FAIL And I think it's pretty obvious that we should only try PCI 0xcf9 as a last resort - if at all. The other observation is that (on this box) we should never try the PCI reboot method, but close with either the 'triple fault' or the 'BIOS' (terminal!) reboot methods. Thirdly, CF9_COND is a total misnomer - it should be something like CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ... So this patch fixes the worst problems: - it orders the actual reboot logic to follow the reboot ordering pattern - it was in a pretty random order before for no good reason. - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and BOOT_CF9_SAFE flags to make the code more obvious. - it tries the BIOS reboot method before the PCI reboot method. (Since 'BIOS' is a terminal reboot method resulting in a hang if it does not work, this is essentially equivalent to removing the PCI reboot method from the default reboot chain.) - just for the miraculous possibility of terminal (resulting in hang) reboot methods of triple fault or BIOS returning without having done their job, there's an ordering between them as well. Reported-and-bisected-and-tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Li Aubrey <aubrey.li@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix BPF filter validation of netlink attribute accesses, from Mathias Kruase. 2) Netfilter conntrack generation seqcount not initialized properly, from Andrey Vagin. 3) Fix comparison mask computation on big-endian in nft_cmp_fast(), from Patrick McHardy. 4) Properly limit MTU over ipv6, from Eric Dumazet. 5) Fix seccomp system call argument population on 32-bit, from Daniel Borkmann. 6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead skb->mac_len needs to be used. From Vlad Yasevich. 7) We have several cases of using socket based communications to implement a tunnel. For example, some tunnels are encapsulations over UDP so we use an internal kernel UDP socket to do the transmits. These tunnels should behave just like other software devices and pass the packets on down to the next layer. Most importantly we want the top-level socket (eg TCP) that created the traffic to be charged for the SKB memory. However, once you get into the IP output path, we have code that assumed that whatever was attached to skb->sk is an IP socket. To keep the top-level socket being charged for the SKB memory, whilst satisfying the needs of the IP output path, we now pass in an explicit 'sk' argument. From Eric Dumazet. 8) ping_init_sock() leaks group info, from Xiaoming Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) cxgb4: use the correct max size for firmware flash qlcnic: Fix MSI-X initialization code ip6_gre: don't allow to remove the fb_tunnel_dev ipv4: add a sock pointer to dst->output() path. ipv4: add a sock pointer to ip_queue_xmit() driver/net: cosa driver uses udelay incorrectly at86rf230: fix __at86rf230_read_subreg function at86rf230: remove check if AVDD settled net: cadence: Add architecture dependencies net: Start with correct mac_len in skb_network_protocol Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" cxgb4: Save the correct mac addr for hw-loopback connections in the L2T net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF qlcnic: Do not disable SR-IOV when VFs are assigned to VMs qlcnic: Fix QLogic application/driver interface for virtual NIC configuration qlcnic: Fix PVID configuration on eSwitch port. qlcnic: Fix max ring count calculation qlcnic: Fix to send INIT_NIC_FUNC as first mailbox. qlcnic: Fix panic due to uninitialzed delayed_work struct in use. ...
2014-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains three Netfilter fixes for your net tree, they are: * Fix missing generation sequence initialization which results in a splat if lockdep is enabled, it was introduced in the recent works to improve nf_conntrack scalability, from Andrey Vagin. * Don't flush the GRE keymap list in nf_conntrack when the pptp helper is disabled otherwise this crashes due to a double release, from Andrey Vagin. * Fix nf_tables cmp fast in big endian, from Patrick McHardy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_WDaniel Borkmann
While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has been wrongly decoded by commit a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS although it should have been decoded as BPF_LD|BPF_W|BPF_ABS. In practice, this should not have much side-effect though, as such conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse operation PR_GET_SECCOMP will only return the current seccomp mode, but not the filter itself. Since the transition to the new BPF infrastructure, it's also not used anymore, so we can simply remove this as it's unreachable. Fixes: a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13Merge branch 'slab/next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull slab changes from Pekka Enberg: "The biggest change is byte-sized freelist indices which reduces slab freelist memory usage: https://lkml.org/lkml/2013/12/2/64" * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: mm: slab/slub: use page->list consistently instead of page->lru mm/slab.c: cleanup outdated comments and unify variables naming slab: fix wrongly used macro slub: fix high order page allocation problem with __GFP_NOFAIL slab: Make allocations with GFP_ZERO slightly more efficient slab: make more slab management structure off the slab slab: introduce byte sized index for the freelist of a slab slab: restrict the number of objects in a slab slab: introduce helper functions to get/set free object slab: factor out calculate nr objects in cache_estimate
2014-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull yet more networking updates from David Miller: 1) Various fixes to the new Redpine Signals wireless driver, from Fariya Fatima. 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from Dmitry Petukhov. 3) UFO and TSO packets differ in whether they include the protocol header in gso_size, account for that in skb_gso_transport_seglen(). From Florian Westphal. 4) If VLAN untagging fails, we double free the SKB in the bridging output path. From Toshiaki Makita. 5) Several call sites of sk->sk_data_ready() were referencing an SKB just added to the socket receive queue in order to calculate the second argument via skb->len. This is dangerous because the moment the skb is added to the receive queue it can be consumed in another context and freed up. It turns out also that none of the sk->sk_data_ready() implementations even care about this second argument. So just kill it off and thus fix all these use-after-free bugs as a side effect. 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti. 7) pktgen needs to do locking properly for LLTX devices, from Daniel Borkmann. 8) xen-netfront driver initializes TX array entries in RX loop :-) From Vincenzo Maffione. 9) After refactoring, some tunnel drivers allow a tunnel to be configured on top itself. Fix from Nicolas Dichtel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vti: don't allow to add the same tunnel twice gre: don't allow to add the same tunnel twice drivers: net: xen-netfront: fix array initialization bug pktgen: be friendly to LLTX devices r8152: check RTL8152_UNPLUG net: sun4i-emac: add promiscuous support net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO net: ipv6: Fix oif in TCP SYN+ACK route lookup. drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts drivers: net: cpsw: discard all packets received when interface is down net: Fix use after free by removing length arg from sk_data_ready callbacks. Drivers: net: hyperv: Address UDP checksum issues Drivers: net: hyperv: Negotiate suitable ndis version for offload support Drivers: net: hyperv: Allocate memory for all possible per-pecket information bridge: Fix double free and memory leak around br_allowed_ingress bonding: Remove debug_fs files when module init fails i40evf: program RSS LUT correctly i40evf: remove open-coded skb_cow_head ixgb: remove open-coded skb_cow_head igbvf: remove open-coded skb_cow_head ...
2014-04-12Merge tag 'llvmlinux-for-v3.15' of ↵Linus Torvalds
git://git.linuxfoundation.org/llvmlinux/kernel Pull llvm patches from Behan Webster: "These are some initial updates to support compiling the kernel with clang. These patches have been through the proper reviews to the best of my ability, and have been soaking in linux-next for a few weeks. These patches by themselves still do not completely allow clang to be used with the kernel code, but lay the foundation for other patches which are still under review. Several other of the LLVMLinux patches have been already added via maintainer trees" * tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel: x86: LLVMLinux: Fix "incomplete type const struct x86cpu_device_id" x86 kbuild: LLVMLinux: More cc-options added for clang x86, acpi: LLVMLinux: Remove nested functions from Thinkpad ACPI LLVMLinux: Add support for clang to compiler.h and new compiler-clang.h LLVMLinux: Remove warning about returning an uninitialized variable kbuild: LLVMLinux: Fix LINUX_COMPILER definition script for compilation with clang Documentation: LLVMLinux: Update Documentation/dontdiff kbuild: LLVMLinux: Adapt warnings for compilation with clang kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang
2014-04-12Merge tag 'ntb-3.15' of git://github.com/jonmason/ntbLinus Torvalds
Pull PCIe non-transparent bridge fixes and features from Jon Mason: "NTB driver bug fixes to address issues in list traversal, skb leak in ntb_netdev, a typo, and a leak of msix entries in the error path. Clean ups of the event handling logic, as well as a overall style cleanup. Finally, the driver was converted to use the new pci_enable_msix_range logic (and the refactoring to go along with it)" * tag 'ntb-3.15' of git://github.com/jonmason/ntb: ntb: Use pci_enable_msix_range() instead of pci_enable_msix() ntb: Split ntb_setup_msix() into separate BWD/SNB routines ntb: Use pci_msix_vec_count() to obtain number of MSI-Xs NTB: Code Style Clean-up NTB: client event cleanup ntb: Fix leakage of ntb_device::msix_entries[] array NTB: Fix typo in setting one translation register ntb_netdev: Fix skb free issue in open ntb_netdev: Fix list_for_each_entry exit issue
2014-04-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The first vfs pile, with deep apologies for being very late in this window. Assorted cleanups and fixes, plus a large preparatory part of iov_iter work. There's a lot more of that, but it'll probably go into the next merge window - it *does* shape up nicely, removes a lot of boilerplate, gets rid of locking inconsistencie between aio_write and splice_write and I hope to get Kent's direct-io rewrite merged into the same queue, but some of the stuff after this point is having (mostly trivial) conflicts with the things already merged into mainline and with some I want more testing. This one passes LTP and xfstests without regressions, in addition to usual beating. BTW, readahead02 in ltp syscalls testsuite has started giving failures since "mm/readahead.c: fix readahead failure for memoryless NUMA nodes and limit readahead pages" - might be a false positive, might be a real regression..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) missing bits of "splice: fix racy pipe->buffers uses" cifs: fix the race in cifs_writev() ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure kill generic_file_buffered_write() ocfs2_file_aio_write(): switch to generic_perform_write() ceph_aio_write(): switch to generic_perform_write() xfs_file_buffered_aio_write(): switch to generic_perform_write() export generic_perform_write(), start getting rid of generic_file_buffer_write() generic_file_direct_write(): get rid of ppos argument btrfs_file_aio_write(): get rid of ppos kill the 5th argument of generic_file_buffered_write() kill the 4th argument of __generic_file_aio_write() lustre: don't open-code kernel_recvmsg() ocfs2: don't open-code kernel_recvmsg() drbd: don't open-code kernel_recvmsg() constify blk_rq_map_user_iov() and friends lustre: switch to kernel_sendmsg() ocfs2: don't open-code kernel_sendmsg() take iov_iter stuff to mm/iov_iter.c process_vm_access: tidy up a bit ...
2014-04-12Merge tag 'trace-3.15-v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: "This includes the final patch to clean up and fix the issue with the design of tracepoints and how a user could register a tracepoint and have that tracepoint not be activated but no error was shown. The design was for an out of tree module but broke in tree users. The clean up was to remove the saving of the hash table of tracepoint names such that they can be enabled before they exist (enabling a module tracepoint before that module is loaded). This added more complexity than needed. The clean up was to remove that code and just enable tracepoints that exist or fail if they do not. This removed a lot of code as well as the complexity that it brought. As a side effect, instead of registering a tracepoint by its name, the tracepoint needs to be registered with the tracepoint descriptor. This removes having to duplicate the tracepoint names that are enabled. The second patch was added that simplified the way modules were searched for. This cleanup required changes that were in the 3.15 queue as well as some changes that were added late in the 3.14-rc cycle. This final change waited till the two were merged in upstream and then the change was added and full tests were run. Unfortunately, the test found some errors, but after it was already submitted to the for-next branch and not to be rebased. Sparse errors were detected by Fengguang Wu's bot tests, and my internal tests discovered that the anonymous union initialization triggered a bug in older gcc compilers. Luckily, there was a bugzilla for the gcc bug which gave a work around to the problem. The third and fourth patch handled the sparse error and the gcc bug respectively. A final patch was tagged along to fix a missing documentation for the README file" * tag 'trace-3.15-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Add missing function triggers dump and cpudump to README tracing: Fix anonymous unions in struct ftrace_event_call tracepoint: Fix sparse warnings in tracepoint.c tracepoint: Simplify tracepoint module search tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints
2014-04-12Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit updates from Eric Paris. * git://git.infradead.org/users/eparis/audit: (28 commits) AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range audit: do not cast audit_rule_data pointers pointlesly AUDIT: Allow login in non-init namespaces audit: define audit_is_compat in kernel internal header kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c sched: declare pid_alive as inline audit: use uapi/linux/audit.h for AUDIT_ARCH declarations syscall_get_arch: remove useless function arguments audit: remove stray newline from audit_log_execve_info() audit_panic() call audit: remove stray newlines from audit_log_lost messages audit: include subject in login records audit: remove superfluous new- prefix in AUDIT_LOGIN messages audit: allow user processes to log from another PID namespace audit: anchor all pid references in the initial pid namespace audit: convert PPIDs to the inital PID namespace. pid: get pid_t ppid of task in init_pid_ns audit: rename the misleading audit_get_context() to audit_take_context() audit: Add generic compat syscall support audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL ...
2014-04-11Merge git://git.infradead.org/users/willy/linux-nvmeLinus Torvalds
Pull NVMe driver updates from Matthew Wilcox: "Various updates to the NVMe driver. The most user-visible change is that drive hotplugging now works and CPU hotplug while an NVMe drive is installed should also work better" * git://git.infradead.org/users/willy/linux-nvme: NVMe: Retry failed commands with non-fatal errors NVMe: Add getgeo to block ops NVMe: Start-stop nvme_thread during device add-remove. NVMe: Make I/O timeout a module parameter NVMe: CPU hot plug notification NVMe: per-cpu io queues NVMe: Replace DEFINE_PCI_DEVICE_TABLE NVMe: Fix divide-by-zero in nvme_trans_io_get_num_cmds NVMe: IOCTL path RCU protect queue access NVMe: RCU protected access to io queues NVMe: Initialize device reference count earlier NVMe: Add CONFIG_PM_SLEEP to suspend/resume functions
2014-04-11Merge tag 'pm+acpi-3.15-rc1-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI and power management fixes and updates from Rafael Wysocki: "This is PM and ACPI material that has emerged over the last two weeks and one fix for a CPU hotplug regression introduced by the recent CPU hotplug notifiers registration series. Included are intel_idle and turbostat updates from Len Brown (these have been in linux-next for quite some time), a new cpufreq driver for powernv (that might spend some more time in linux-next, but BenH was asking me so nicely to push it for 3.15 that I couldn't resist), some cpufreq fixes and cleanups (including fixes for some silly breakage in a couple of cpufreq drivers introduced during the 3.14 cycle), assorted ACPI cleanups, wakeup framework documentation fixes, a new sysfs attribute for cpuidle and a new command line argument for power domains diagnostics. Specifics: - Fix for a recently introduced CPU hotplug regression in ARM KVM from Ming Lei. - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32 cpufreq drivers introduced during the 3.14 cycle (-stable material) from Chen Gang and Viresh Kumar. - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits from Gautham R Shenoy and Srivatsa S Bhat. - Exynos cpufreq driver fix preventing it from being included into multiplatform builds that aren't supported by it from Sachin Kamat. - cpufreq cleanups related to the usage of the driver_data field in struct cpufreq_frequency_table from Viresh Kumar. - cpufreq ppc driver cleanup from Sachin Kamat. - Intel BayTrail support for intel_idle and ACPI idle from Len Brown. - Intel CPU model 54 (Atom N2000 series) support for intel_idle from Jan Kiszka. - intel_idle fix for Intel Ivy Town residency targets from Len Brown. - turbostat updates (Intel Broadwell support and output cleanups) from Len Brown. - New cpuidle sysfs attribute for exporting C-states' target residency information to user space from Daniel Lezcano. - New kernel command line argument to prevent power domains enabled by the bootloader from being turned off even if they are not in use (for diagnostics purposes) from Tushar Behera. - Fixes for wakeup sysfs attributes documentation from Geert Uytterhoeven. - New ACPI video blacklist entry for ThinkPad Helix from Stephen Chandler Paul. - Assorted ACPI cleanups and a Kconfig help update from Jonghwan Choi, Zhihui Zhang, Hanjun Guo" * tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits) ACPI: Update the ACPI spec information in Kconfig arm, kvm: fix double lock on cpu_add_remove_lock cpuidle: sysfs: Export target residency information cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h cpufreq: create another field .flags in cpufreq_frequency_table cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table cpufreq: don't print value of .driver_data from core cpufreq: ia64: don't set .driver_data to index cpufreq: powernv: Select CPUFreq related Kconfig options for powernv cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids cpufreq: powernv: cpufreq driver for powernv platform cpufreq: at32ap: don't declare local variable as static cpufreq: loongson2_cpufreq: don't declare local variable as static cpufreq: unicore32: fix typo issue for 'clk' cpufreq: exynos: Disable on multiplatform build PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes PM / domains: Add pd_ignore_unused to keep power domains enabled ACPI / dock: Drop dock_device_ids[] table ACPI / video: Favor native backlight interface for ThinkPad Helix ACPI / thermal: Fix wrong variable usage in debug statement ...
2014-04-11net: Fix use after free by removing length arg from sk_data_ready callbacks.David S. Miller
Several spots in the kernel perform a sequence like: skb_queue_tail(&sk->s_receive_queue, skb); sk->sk_data_ready(sk, skb->len); But at the moment we place the SKB onto the socket receive queue it can be consumed and freed up. So this skb->len access is potentially to freed up memory. Furthermore, the skb->len can be modified by the consumer so it is possible that the value isn't accurate. And finally, no actual implementation of this callback actually uses the length argument. And since nobody actually cared about it's value, lots of call sites pass arbitrary values in such as '0' and even '1'. So just remove the length argument from the callback, that way there is no confusion whatsoever and all of these use-after-free cases get fixed as a side effect. Based upon a patch by Eric Dumazet and his suggestion to audit this issue tree-wide. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11mm: slab/slub: use page->list consistently instead of page->lruDave Hansen
'struct page' has two list_head fields: 'lru' and 'list'. Conveniently, they are unioned together. This means that code can use them interchangably, which gets horribly confusing like with this nugget from slab.c: > list_del(&page->lru); > if (page->active == cachep->num) > list_add(&page->list, &n->slabs_full); This patch makes the slab and slub code use page->lru universally instead of mixing ->list and ->lru. So, the new rule is: page->lru is what the you use if you want to keep your page on a list. Don't like the fact that it's not called ->list? Too bad. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
2014-04-10IB/mlx5: Add block multicast loopback supportEli Cohen
Add support for the block multicast loopback QP creation flag along the proper firmware API for that. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-10AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERICChris Metcalf
On systems with CONFIG_COMPAT we introduced the new requirement that audit_classify_compat_syscall() exists. This wasn't true for everything (apparently not for "tilegx", which I know less that nothing about.) Instead of wrapping the preprocessor optomization with CONFIG_COMPAT we should have used the new CONFIG_AUDIT_COMPAT_GENERIC. This patch uses that config option to make sure only arches which intend to implement this have the requirement. This works fine for tilegx according to Chris Metcalf Signed-off-by: Eric Paris <eparis@redhat.com>
2014-04-10NVMe: Retry failed commands with non-fatal errorsKeith Busch
For commands returned with failed status, queue these for resubmission and continue retrying them until success or for a limited amount of time. The final timeout was arbitrarily chosen so requests can't be retried indefinitely. Since these are requeued on the nvmeq that submitted the command, the callbacks have to take an nvmeq instead of an nvme_dev as a parameter so that we can use the locked queue to append the iod to retry later. The nvme_iod conviently can be used to track how long we've been trying to successfully complete an iod request. The nvme_iod also provides the nvme prp dma mappings, so I had to move a few things around so we can keep those mappings. Signed-off-by: Keith Busch <keith.busch@intel.com> [fixed checkpatch issue with long line] Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2014-04-10NVMe: Make I/O timeout a module parameterKeith Busch
Increase the default timeout to 30 seconds to match SCSI. Signed-off-by: Keith Busch <keith.busch@intel.com> [use byte instead of ushort] Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>