summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-02-28packet: allow to transmit +4 byte in TX_RING slot for VLAN caseDaniel Borkmann
Commit 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.") added the possibility for non-mmaped frames to send extra 4 byte for VLAN header so the MTU increases from 1500 to 1504 byte, for example. Commit cbd89acb9eb2 ("af_packet: fix for sending VLAN frames via packet_mmap") attempted to fix that for the mmap part but was reverted as it caused regressions while using eth_type_trans() on output path. Lets just act analogous to 57f89bfa2140 and add a similar logic to TX_RING. We presume size_max as overcharged with +4 bytes and later on after skb has been built by tpacket_fill_skb() check for ETH_P_8021Q header on packets larger than normal MTU. Can be easily reproduced with a slightly modified trafgen in mmap(2) mode, test cases: { fill(0xff, 12) const16(0x8100) fill(0xff, <1504|1505>) } { fill(0xff, 12) const16(0x0806) fill(0xff, <1500|1501>) } Note that we need to do the test right after tpacket_fill_skb() as sockets can have PACKET_LOSS set where we would not fail but instead just continue to traverse the ring. Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Ben Greear <greearb@candelatech.com> Cc: Phil Sutter <phil@nwl.cc> Tested-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28Merge branch 'intel-next'David S. Miller
Aaron Brown says: ==================== This series contains updates to ixgbe and ixgbevf. Don provides an update to change a hard coded timeout interval to a system-wide timeout one, collects AUTOC register functions into one place and fixes some firmware bit handling. Emil resolves a tx handling error introduced in a recent commit and adds check for CHECKSUM_PARTIAL to avoid an skb_is_gso check ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28ixgbevf: add check for CHECKSUM_PARTIAL when doing TSOEmil Tantilov
This patch adds check for CHECKSUM_PARTIAL to avoid the skb_is_gso check in ixgbevf_tso(). It should reduce overhead for workloads that are not using TSO or checksum offloads. It is the same as in ixgbe. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28ixgbevf: fix handling of tx checksummingEmil Tantilov
This patch resolves an issue introduced by: commit 7ad1a093519e37fb673579819bf6af122641c397 ixgbevf: make the first tx_buffer a repository for most of the skb info Incorrect check for the result of ixgbevf_tso() can lead to calling ixgbevf_tx_csum() which can spawn 2 context descriptors and result in performance degradation and/or corrupted packets. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28ixgbe: Add check for FW veto bitDon Skidmore
The driver will now honor the MNG FW veto bit in blocking link resets. This patch will affect x520 and x540 systems. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28ixgbe: fix bit toggled for 82599 reset fix.Don Skidmore
The current code doesn't toggle the correct bit to reset the data pipeline on Restart_AN assertion. This patch corrects that. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28ixgbe: collect all 82599 AUTOC code in one functionDon Skidmore
When reading or writing to the AUTOC register on 82599 devices we need to preform various operations that aren't needed for other MAC types. This patch will collect all of that code into one place to minimize MAC checks in common code paths. While doing this I also clean up some cases where we weren't holding the SW/FW semaphore during a read/modify/write of AUTOC. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28ixgbe: fix to use correct timeout interval for memory read completionDon Skidmore
Currently we were just always polling for a hard coded 80 ms and not respecting the system-wide timeout interval. Since up until now all devices have been tested with this 80ms value we continue to use this value as a hard minimum. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27ipv6: addrconf: silence sparse endianness warningsBjørn Mork
Avoid the following sparse __CHECK_ENDIAN__ warnings: include/net/addrconf.h:318:25: warning: restricted __be64 degrades to integer include/net/addrconf.h:318:70: warning: restricted __be64 degrades to integer include/net/addrconf.h:330:25: warning: restricted __be64 degrades to integer include/net/addrconf.h:330:70: warning: restricted __be64 degrades to integer include/net/addrconf.h:347:25: warning: restricted __be64 degrades to integer include/net/addrconf.h:348:26: warning: restricted __be64 degrades to integer include/net/addrconf.h:349:18: warning: restricted __be64 degrades to integer The warnings are false but they make it harder to spot real bugs. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27neigh: directly goto out after setting nud_state to NUD_FAILEDDuan Jiong
Because those following if conditions will not be matched. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== This is the rework of the IPsec virtual tunnel interface for ipv4 to support inter address family tunneling and namespace crossing. The only change to the last RFC version is a compile fix for an odd configuration where CONFIG_XFRM is set but CONFIG_INET is not set. 1) Add and use a IPsec protocol multiplexer. 2) Add xfrm_tunnel_skb_cb to the skb common buffer to store a receive callback there. 3) Make vti work with i_key set by not including the i_key when comupting the hash for the tunnel lookup in case of vti tunnels. 4) Update ip_vti to use it's own receive hook. 5) Remove xfrm_tunnel_notifier, this is replaced by the IPsec protocol multiplexer. 6) We need to be protocol family indepenent, so use the on xfrm_lookup returned dst_entry instead of the ipv4 rtable in vti_tunnel_xmit(). 7) Add support for inter address family tunneling. 8) Check if the tunnel endpoints of the xfrm state and the vti interface are matching and return an error otherwise. 8) Enable namespace crossing tor vti devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27Merge branch 'kdoc'David S. Miller
Luis R. Rodriguez says: ==================== net: start kdoc'ifying net_device While working on extending some functionality I felt restricted with the amount of documentation I can add. Part of this is that the existing style on the header files don't let me be verbose. This starts addressing that by using kdoc for the net_device flags, and as Ben noted, the priv_flags can be moved out from UAPI. Luis R. Rodriguez (2): net: kdoc struct net_device flags and priv_flags net: move net_device priv_flags out from UAPI ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27net: move net_device priv_flags out from UAPILuis R. Rodriguez
These are private to userspace, and they're unstable anyway and can be shuffled at will (see 080e4130b1fb) so any userspace application relying on them is on crack. Test compiled with allyesconfig. mcgrof@drvbp1 /pub/mem/mcgrof/net-next (git::master)$ make allyesconfig mcgrof@drvbp1 /pub/mem/mcgrof/net-next (git::master)$ time make -j 20 ... BUILD arch/x86/boot/bzImage Setup is 16992 bytes (padded to 17408 bytes). System is 56153 kB CRC 721d2751 Kernel: arch/x86/boot/bzImage is ready (#1) real 19m35.744s user 280m37.984s sys 27m54.104s Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27net: kdoc struct net_device flags and priv_flagsLuis R. Rodriguez
We have documentation for these flags but they're scattered all over the place. #defines don't allow documentation to be written easily so to help to start bringing some documentation together use the enums kdoc practice but keep the defines to allow userspace to be able to #ifdef them. I've verified the same values are assigned before and after with a simple userspace test program [0] and checksumming the output. [0] http://drvbp1.linux-foundation.org/~mcgrof/kdoc/netdev_flags/ mcgrof@gnat ~/tmp $ ./check-flags | sha1sum 0ec5b6b1840aa3bb9ce464e61c564820871c92c3 - Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27atm: nicstar: remove interruptible_sleep_on_timeoutArnd Bergmann
We are trying to finally kill off interruptible_sleep_on_timeout. the two uses in the nicstar driver can be trivially replaced with wait_event_interruptible_lock_irq_timeout, which prevents the wake-up race and is able to check the buffer state with scq->lock held. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26tcp: switch rtt estimations to usec resolutionEric Dumazet
Upcoming congestion controls for TCP require usec resolution for RTT estimations. Millisecond resolution is simply not enough these days. FQ/pacing in DC environments also require this change for finer control and removal of bimodal behavior due to the current hack in tcp_update_pacing_rate() for 'small rtt' TCP_CONG_RTT_STAMP is no longer needed. As Julian Anastasov pointed out, we need to keep user compatibility : tcp_metrics used to export RTT and RTTVAR in msec resolution, so we added RTT_US and RTTVAR_US. An iproute2 patch is needed to use the new attributes if provided by the kernel. In this example ss command displays a srtt of 32 usecs (10Gbit link) lpk51:~# ./ss -i dst lpk52 Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port tcp ESTAB 0 1 10.246.11.51:42959 10.246.11.52:64614 cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559 Updated iproute2 ip command displays : lpk51:~# ./ip tcp_metrics | grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51 Old binary displays : lpk51:~# ip tcp_metrics | grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51 With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Larry Brakmo <brakmo@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net: add skb_mstamp infrastructureEric Dumazet
ktime_get() is too expensive on some cases, and we'd like to get usec resolution timestamps in TCP stack. This patch adds a light weight facility using a combination of local_clock() and jiffies samples. Instead of : u64 t0, t1; t0 = ktime_get(); // stuff t1 = ktime_get(); delta_us = ktime_us_delta(t1, t0); use : struct skb_mstamp t0, t1; skb_mstamp_get(&t0); // stuff skb_mstamp_get(&t1); delta_us = skb_mstamp_us_delta(&t1, &t0); Note : local_clock() might have a (bounded) drift between cpus. Do not use this infra in place of ktime_get() without understanding the issues. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Larry Brakmo <brakmo@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26phy: micrel: add of configuration for LED modeBen Dooks
Add support for the led-mode property for the following PHYs which have a single LED mode configuration value. KSZ8001 and KSZ8041 which both use register 0x1e bits 15,14 and KSZ8021, KSZ8031 and KSZ8051 which use register 0x1f bits 5,4 to control the LED configuration. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26isdn: fix multiple sleep_on racesArnd Bergmann
The isdn core code uses a couple of wait queues with interruptible_sleep_on, which is racy and about to get removed from the kernel. Fortunately, we know for each case what we are waiting for, so they can all be converted to the better wait_event_interruptible interface. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26isdn: divert, hysdn: fix interruptible_sleep_on raceArnd Bergmann
These two drivers use identical code for their procfs status file handling, which contains a small race against status data becoming available while reading the file. This uses wait_event_interruptible instead to fix this particular race and eventually get rid of all sleep_on instances. There seems to be another race involving multiple concurrent readers of the same procfs file, which I don't try to fix here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26isdn: hisax/elsa: fix sleep_on race in elsa FSMArnd Bergmann
The state machine code in the elsa driver uses interruptible_sleep_on to wait for state changes, which is racy. A closer look at the possible states reveals that it is always used to wait for getting back into ARCOFI_NOP, so we can use wait_event_interruptible instead. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26isdn: pcbit: fix interruptible_sleep_on raceArnd Bergmann
interruptible_sleep_on is racy and going away. In case of pcbit, the driver would run into a timeout if the card is initialized before we start waiting for it. This uses wait_event to fix the race. In order to do this, the state machine handling for the timeout case has to get trivially reorganized so we actually know whether the timeout has occorred or not. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26atm: firestream: fix interruptible_sleep_on raceArnd Bergmann
interruptible_sleep_on is racy and going away. This replaces the one use in the firestream driver with the appropriate wait_event_interruptible variant. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Chas Williams <chas@cmf.nrl.navy.mil> Cc: linux-atm-general@lists.sourceforge.net Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26Merge branch 'intel-next'David S. Miller
Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe, igb and documentation. The first four have been sent up as part of other series where 1 or more in the series were rejected and either dropped or still being worked on for reasons unrelated to these patches. Don makes recovery from a HW ECC error just schedule a reset as it turns out the previous behaviour of forcing the user to reload is not necessary. Mark adds WoL support to port 0 of a new device. Jacob removes a magic number from the ptp_caps.name and updates the SubmittingPatches documentation with details on the Fixed: tag. And Carolyn updates igb files to remove the FSF physical mail address. [ DaveM Note: SubmittingPatches change omitted, will go via LKML ] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26igb: Update license text to remove FSF address and update copyright.Carolyn Wyborny
This patch updates the license text to remove address of Free Software Foundation and refer users to www.gnu.org instead. This patch also updates the copyright dates in appropriate igb driver files. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26igb: make local functions static and remove dead codeJeff Kirsher
Based on Stephen Hemminger's original patch. Make local functions static, and remove unused functions. Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ixgbe: Add WoL support for a new deviceMark Rustad
Add WoL support for port 0 of a new 82599-based device. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ixgbe: don't use magic size number to assign ptp_caps.nameJacob Keller
Rather than using a magic size number, just use sizeof since that will work and is more robust to future changes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ixgbe: modify behavior on receiving a HW ECC error.Don Skidmore
Currently when we noticed a HW ECC error we would request the use reload the driver to force a reset of the part. This was done due to the mistaken believe that a normal reset would not be sufficient. Well it turns out it would be so now we just schedule a reset upon seeing the ECC. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMITHannes Frederic Sowa
This option has the same semantic as IP_PMTUDISC_OMIT for IPv4 which got recently introduced. It doesn't honor the path mtu discovered by the host but in contrary to IPV6_PMTUDISC_INTERFACE allows the generation of fragments if the packet size exceeds the MTU of the outgoing interface MTU. Fixes: 93b36cf3425b9b ("ipv6: support IPV6_PMTU_INTERFACE on sockets") Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMITHannes Frederic Sowa
IP_PMTUDISC_INTERFACE has a design error: because it does not allow the generation of fragments if the interface mtu is exceeded, it is very hard to make use of this option in already deployed name server software for which I introduced this option. This patch adds yet another new IP_MTU_DISCOVER option to not honor any path mtu information and not accepting new icmp notifications destined for the socket this option is enabled on. But we allow outgoing fragmentation in case the packet size exceeds the outgoing interface mtu. As such this new option can be used as a drop-in replacement for IP_PMTUDISC_DONT, which is currently in use by most name server software making the adoption of this option very smooth and easy. The original advantage of IP_PMTUDISC_INTERFACE is still maintained: ignoring incoming path MTU updates and not honoring discovered path MTUs in the output path. Fixes: 482fc6094afad5 ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE") Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv4: use ip_skb_dst_mtu to determine mtu in ip_fragmentHannes Frederic Sowa
ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket is attached to the skb (in case of forwarding) or determines the mtu like we do in ip_finish_output, which actually checks if we should branch to ip_fragment. Thus use the same function to determine the mtu here, too. This is important for the introduction of IP_PMTUDISC_OMIT, where we want the packets getting cut in pieces of the size of the outgoing interface mtu. IPv6 already does this correctly. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26neigh: probe application via netlink in NUD_PROBETimo Teräs
iproute2 arpd seems to expect this as there's code and comments to handle netlink probes with NUD_PROBE set. It is used to flush the arpd cached mappings. opennhrp instead turns off unicast probes (so it can handle all neighbour discovery). Without this change it will not see NUD_PROBE probes and cannot reconfirm the mapping. Thus currently neigh entry will just fail and can cause few packets dropped until broadcast discovery is restarted. Earlier discussion on the subject: http://marc.info/?t=139305877100001&r=1&w=2 Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ieee802154: fix new function declarationJean Sacren
The commit 8fad346f366a7 ("eee802154: add basic support for RF212 to at86rf230 driver") introduced the new function is_rf212() with some minor issues in declaration: 1) Fix the function type by changing it to bool as the function definition returns a boolean value. Additionally both callers of is_rf212() are expected to return a boolean value. 2) Fix the function specifier by deleting the inline keyword as the compiler takes care of that. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv6: log src and dst along with "udp checksum is 0"Bjørn Mork
These info messages are rather pointless without any means to identify the source of the bogus packets. Logging the src and dst addresses and ports may help a bit. Cc: Joe Perches <joe@perches.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26vxlan: remove unused port variable in vxlan_udp_encap_recv()Pablo Neira Ayuso
Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26Merge branch 'mlx4'David S. Miller
Amir Vadai says: ==================== net, net/mlx4: Add sysfs file for port number Modern distro's are using biosdevname to rename interface to a name based on slot/port number. biosdevname can't get the port number of devices that have multiple ports that share the same PCI function. This patch adds a sysfs file under: /sys/devices/.../net/<interface>/dev_port, that contains the port number (0 based) - to be used by biosdevname. Also, dev_id was wrongly used in mlx4_en driver - added a patch that fix it. This patch was tested and applied over commit 51adfcc "net: bcmgenet: remove unused bh_lock member" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net/mlx4_en: Fix bad use of dev_idAmir Vadai
dev_id should be set for multiple netdev's sharing the same MAC, which is not the case here. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net/mlx4_en: Expose port number through sysfsAmir Vadai
Initialize dev_port with port number (0 based) to be accessed through sysfs from user space. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net: Add sysfs file for port numberAmir Vadai
Add a sysfs file to enable user space to query the device port number used by a netdevice instance. This is needed for devices that have multiple ports on the same PCI function. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26Merge branch 'bnx2x'David S. Miller
Michal Schmidt says: ==================== bnx2x: minimize RAM usage in kdump kdump kernels usually have only a small amount of memory reserved. bnx2x can be memory-hungry. Let's minimize its memory usage when running in kdump. I detect kdump by looking at the "reset_devices" flag. A couple of storage drivers (cciss, hpsa) use it for the same purpose. I am not sure this is the best way to solve the problem, but it works. Should it be made more generic by, say, looking at the total amount of lowmem instead? Not using TPA by default when lowmem is small and/or defaulting to fewer queues would help 32bit systems where a driver for a multi-function multi-queue NIC can consume a significant amount of available memory. Or do we want no such heuristics? Is this something to consider doing for other network drivers too? ==================== Acked-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bnx2x: save RAM in kdump kernel by disabling TPAMichal Schmidt
When running in a kdump kernel, disable TPA. This saves memory, which tends to be scarce in kdump. TPA, being a receive acceleration, is unlikely to be useful for kdump, whose purpose is to send the memory image out. This saves additional 5 MB in the kdump environment. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bnx2x: save RAM in kdump kernel by using a single queueMichal Schmidt
When running in a kdump kernel, make sure to use only a single ethernet queue even if a num_queues option in /etc/modprobe.d/*.conf would specify otherwise. This saves memory, which tends to be scarce in kdump. This saves about 40 MB in the kdump environment on a setup with num_queues=8 in the config file. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bnx2x: clamp num_queues to prevent passing a negative valueMichal Schmidt
Use the clamp() macro to make the calculation of the number of queues slightly easier to understand. It also avoids a crash when someone accidentally passes a negative value in num_queues= module parameter. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net: tcp: add mib counters to track zero window transitionsFlorian Westphal
Three counters are added: - one to track when we went from non-zero to zero window - one to track the reverse - one counter incremented when we want to announce zero window, but can't because we would shrink current window. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net: order MPLS ethertypes numericallyNeil Jerram
All ethertypes other than ETH_P_MPLS_UC, ETH_P_MPLS_MC and ETH_P_ATMMPOA were already ordered numerically. This commit moves those three ETH_P_... values into correct numerical order too. Signed-off-by: Neil Jerram <Neil.Jerram@metaswitch.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25bnx2x: Remove hidden flow control goto from BNX2X_ALLOC macrosJoe Perches
BNX2X_ALLOC macros use "goto alloc_mem_err" so these labels appear unused in some functions. Expand these macros in-place via coccinelle and some typing. Update the macros to use statement expressions and remove the BNX2X_ALLOC macro. This adds some > 80 char lines. $ cat bnx2x_pci_alloc.cocci @@ expression e1; expression e2; expression e3; @@ - BNX2X_PCI_ALLOC(e1, e2, e3); + e1 = BNX2X_PCI_ALLOC(e2, e3); if (!e1) goto alloc_mem_err; @@ expression e1; expression e2; expression e3; @@ - BNX2X_PCI_FALLOC(e1, e2, e3); + e1 = BNX2X_PCI_FALLOC(e2, e3); if (!e1) goto alloc_mem_err; @@ expression e1; expression e2; @@ - BNX2X_ALLOC(e1, e2); + e1 = kzalloc(e2, GFP_KERNEL); if (!e1) goto alloc_mem_err; @@ expression e1; expression e2; expression e3; @@ - kzalloc(sizeof(e1) * e2, e3) + kcalloc(e2, sizeof(e1), e3) @@ expression e1; expression e2; expression e3; @@ - kzalloc(e1 * sizeof(e2), e3) + kcalloc(e1, sizeof(e2), e3) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25vti4: Enable namespace changingSteffen Klassert
vti4 is now fully namespace aware, so allow namespace changing for vti devices Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25vti4: Check the tunnel endpoints of the xfrm state and the vti interfaceSteffen Klassert
The tunnel endpoints of the xfrm_state we got from the xfrm_lookup must match the tunnel endpoints of the vti interface. This patch ensures this matching. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25vti4: Support inter address family tunneling.Steffen Klassert
With this patch we can tunnel ipv6 traffic via a vti4 interface. A vti4 interface can now have an ipv6 address and ipv6 traffic can be routed via a vti4 interface. The resulting traffic is xfrm transformed and tunneled throuhg ipv4 if matching IPsec policies and states are present. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>