summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2009-01-21inet: Allowing more than 64k connections and heavily optimize bind(0) time.Evgeniy Polyakov
With simple extension to the binding mechanism, which allows to bind more than 64k sockets (or smaller amount, depending on sysctl parameters), we have to traverse the whole bind hash table to find out empty bucket. And while it is not a problem for example for 32k connections, bind() completion time grows exponentially (since after each successful binding we have to traverse one bucket more to find empty one) even if we start each time from random offset inside the hash table. So, when hash table is full, and we want to add another socket, we have to traverse the whole table no matter what, so effectivelly this will be the worst case performance and it will be constant. Attached picture shows bind() time depending on number of already bound sockets. Green area corresponds to the usual binding to zero port process, which turns on kernel port selection as described above. Red area is the bind process, when number of reuse-bound sockets is not limited by 64k (or sysctl parameters). The same exponential growth (hidden by the green area) before number of ports reaches sysctl limit. At this time bind hash table has exactly one reuse-enbaled socket in a bucket, but it is possible that they have different addresses. Actually kernel selects the first port to try randomly, so at the beginning bind will take roughly constant time, but with time number of port to check after random start will increase. And that will have exponential growth, but because of above random selection, not every next port selection will necessary take longer time than previous. So we have to consider the area below in the graph (if you could zoom it, you could find, that there are many different times placed there), so area can hide another. Blue area corresponds to the port selection optimization. This is rather simple design approach: hashtable now maintains (unprecise and racely updated) number of currently bound sockets, and when number of such sockets becomes greater than predefined value (I use maximum port range defined by sysctls), we stop traversing the whole bind hash table and just stop at first matching bucket after random start. Above limit roughly corresponds to the case, when bind hash table is full and we turned on mechanism of allowing to bind more reuse-enabled sockets, so it does not change behaviour of other sockets. Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Tested-by: Denys Fedoryschenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Debugging functions for feature negotiationGerrit Renker
Since all feature-negotiation processing now takes place in feat.c, functions for producing verbose debugging output are concentrated there. New functions to print out values, entry records, and options are provided, and also a macro is defined to not always have the function name in the output line. Thanks a lot to Wei Yongjun and Giuseppe Galeota for help and discussion with an earlier revision of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Initialisation and type-checking of feature sysctlsGerrit Renker
This patch takes care of initialising and type-checking sysctls related to feature negotiation. Type checking is important since some of the sysctls now directly impact the feature-negotiation process. The sysctls are initialised with the known default values for each feature. For the type-checking the value constraints from RFC 4340 are used: * Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes), tested and confirmed that it works up to 4294967295 - for Gbps speed; * Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer); * CCIDs are between 0 .. 255; * request_retries, retries1, retries2 also between 0..255 for good measure; * tx_qlen is checked to be non-negative; * sync_ratelimit remains as before. Notes: ------ 1. Die s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c. 2. As pointed out by Arnaldo, the pattern of type-checking repeats itself in other places, sometimes with exactly the same kind of definitions (e.g. "static int zero;"). It may be a good idea (kernel janitors?) to consolidate type checking. For the sake of keeping the changeset small and in order not to affect other subsystems, I have not strived to generalise here. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Implement both feature-local and feature-remote Sequence Window featureGerrit Renker
This adds full support for local/remote Sequence Window feature, from which the * sequence-number-validity (W) and * acknowledgment-number-validity (W') windows derive as specified in RFC 4340, 7.5.3. Specifically, the following is contained in this patch: * integrated new socket fields into dccp_sk; * updated the update_gsr/gss routines with regard to these fields; * updated handler code: the Sequence Window feature is located at the TX side, so the local feature is meant if the handler-rx flag is false; * the initialisation of `rcv_wnd' in reqsk is removed, since - rcv_wnd is not used by the code anywhere; - sequence number checks are not done in the LISTEN state (cf. 7.5.3); - dccp_check_req checks the Ack number validity more rigorously; * the `struct dccp_minisock' became empty and is now removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Initialisation framework for feature negotiationGerrit Renker
This initialises feature negotiation from two tables, which are in turn are initialised from sysctls. As a novel feature, specifics of the implementation (e.g. that short seqnos and ECN are not yet available) are advertised for robustness. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21appletalk: remove unneeded stubsStephen Hemminger
With net_device_ops if set_mac_address is null, then error is -EOPNOTSUPPORTED. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21rose: convert to network_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21rose: convert to internal net_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21netrom: convert to net_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21netrom: convert to internal net_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21lec: convert to net_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21lec: convert to internal network_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21clip: convert to internal network_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21br2684: convert to net_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21atm: br2684 internal statsStephen Hemminger
Now that stats are in net_device, use them. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21netfilter: ctnetlink: fix scheduling while atomicPatrick McHardy
Caused by call to request_module() while holding nf_conntrack_lock. Reported-and-tested-by: Kövesdi György <kgy@teledigit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix merging of paged packetsHerbert Xu
The previous fix to paged packets broke the merging because it reset the skb->len before we added it to the merged packet. This wasn't detected because it simply resulted in the truncation of the packet while the missing bit is subsequently retransmitted. The fix is to store skb->len before we clobber it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix error handling on extremely short fragsHerbert Xu
When a frag is shorter than an Ethernet header, we'd return a zeroed packet instead of aborting. This patch fixes that. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix handling of complete checksums in IPv6Herbert Xu
We need to perform skb_postpull_rcsum after pulling the IPv6 header in order to maintain the correctness of the complete checksum. This patch also adds a missing iph reload after pulling. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20NET: net_namespace, fix lock imbalanceJiri Slaby
register_pernet_gen_subsys omits mutex_unlock in one fail path. Fix it. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-01-20Revert "xfrm: For 32/64 compatability wrt. xfrm_usersa_info"David S. Miller
This reverts commit fc8c7dc1b29560c016a67a34ccff32a712b5aa86. As indicated by Jiri Klimes, this won't work. These numbers are not only used the size validation, they are also used to locate attributes sitting after the message. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-19net: Fix data corruption when splicing from sockets.Jarek Poplawski
The trick in socket splicing where we try to convert the skb->data into a page based reference using virt_to_page() does not work so well. The idea is to pass the virt_to_page() reference via the pipe buffer, and refcount the buffer using a SKB reference. But if we are splicing from a socket to a socket (via sendpage) this doesn't work. The from side processing will grab the page (and SKB) references. The sendpage() calls will grab page references only, return, and then the from side processing completes and drops the SKB ref. The page based reference to skb->data is not enough to keep the kmalloc() buffer backing it from being reused. Yet, that is all that the socket send side has at this point. This leads to data corruption if the skb->data buffer is reused by SLAB before the send side socket actually gets the TX packet out to the device. The fix employed here is to simply allocate a page and copy the skb->data bytes into that page. This will hurt performance, but there is no clear way to fix this properly without a copy at the present time, and it is important to get rid of the data corruption. With fixes from Herbert Xu. Tested-by: Willy Tarreau <w@1wt.eu> Foreseen-by: Changli Gao <xiaosuo@gmail.com> Diagnosed-by: Willy Tarreau <w@1wt.eu> Reported-by: Willy Tarreau <w@1wt.eu> Fixed-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-19net: Add debug info to track down GSO checksum bugHerbert Xu
I'm trying to track down why people're hitting the checksum warning in skb_gso_segment. As the problem seems to be hitting lots of people and I can't reproduce it or locate the bug, here is a patch to print out more details which hopefully should help us to track this down. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-19net/9p: fid->fid is used uninitializedRoel Kluin
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-16cfg80211: Fix parsed country IE info for 5 GHzLuis R. Rodriguez
The country IE number of channels on 5 GHz specifies the number of 5 GHz channels, not the number of sequential channel numbers. For example, if in a country IEs if the first channel given is 36 and the number of channels passed is 4 then the individual channel numbers defined for the 5 GHz PHY by these parameters are: 36, 40, 44, 48 not: 36, 37, 38, 39 See: http://tinyurl.com/11d-clarification Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-16cfg80211: Fix regression with 11d on bandsLuis R. Rodriguez
This fixes a regression on disallowing bands introduced with the new 802.11d support. The issue is that IEEE-802.11 allows APs to send a subset of what a country regulatory domain defines. This was clarified in this document: http://tinyurl.com/11d-clarification As such it is possible, and this is what is done in practice, that a single band 2.4 GHz AP will only send 2.4 GHz band regulatory information through the 802.11 country information element and then the current intersection with what CRDA provided yields a regulatory domain with no 5 GHz information -- even though that country may actually allow 5 GHz operation. We correct this by only applying the intersection rules on a channel if the the intersection yields a regulatory rule on the same band the channel is on. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-16cfg80211: make handle_band() and handle_channel() wiphy specificLuis R. Rodriguez
This allows us to make more wiphy specific judgements when handling the channels later on. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-16mac80211: more kernel-doc fixesRandy Dunlap
Fix (delete) more mac80211 kernel-doc: Warning(linux-2.6.28-git13//include/net/mac80211.h:375): Excess struct/union/enum/typedef member 'retry_count' description in 'ieee80211_tx_info' Warning(linux-2.6.28-git13//net/mac80211/sta_info.h:308): Excess struct/union/enum/typedef member 'last_txrate' description in 'sta_info' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (95 commits) b44: GFP_DMA skb should not escape from driver korina: do not use IRQF_SHARED with IRQF_DISABLED korina: do not stop queue here korina: fix handling tx_chain_tail korina: do tx at the right position korina: do schedule napi after testing for it korina: rework korina_rx() for use with napi korina: disable napi on close and restart korina: reset resource buffer size to 1536 korina: fix usage of driver_data bnx2x: First slow path interrupt race bnx2x: MTU Filter bnx2x: Indirection table initialization index bnx2x: Missing brackets bnx2x: Fixing the doorbell size bnx2x: Endianness issues bnx2x: VLAN tagged packets without VLAN offload bnx2x: Protecting the link change indication bnx2x: Flow control updated before reporting the link bnx2x: Missing mask when calculating flow control ...
2009-01-159p: disallow RDMA if RDMA CM isn't availableRoland Dreier
If INET=y and INFINIBAND=y, but IPV6=m then INFINIBAND_ADDR_TRANS is set to n and the RDMA CM functions rdma_connect() et al are not built. However, the current config dependencies allow NET_9P_RDMA to be selected in this, which leads to a build failure. Fix this by adding a dependency on INFINIBAND_ADDR_TRANS to disallow NET_9P_RDMA in this case. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-14can: fix slowpath issue in hrtimer callback functionOliver Hartkopp
Due to the loopback functionality in can_send() we can not invoke it from hardirq context which was done inside the bcm_tx_timeout_handler() hrtimer callback: [ 700.361154] [<c012228c>] warn_slowpath+0x80/0xb6 [ 700.361163] [<c013d559>] valid_state+0x125/0x136 [ 700.361171] [<c013d858>] mark_lock+0x18e/0x332 [ 700.361180] [<c013e300>] __lock_acquire+0x12e/0xb1e [ 700.361189] [<f8ab5915>] bcm_tx_timeout_handler+0x0/0xbc [can_bcm] [ 700.361198] [<c031e20a>] dev_queue_xmit+0x191/0x479 [ 700.361206] [<c01262a7>] __local_bh_disable+0x2b/0x64 [ 700.361213] [<c031e20a>] dev_queue_xmit+0x191/0x479 [ 700.361225] [<f8aa69a1>] can_send+0xd7/0x11a [can] [ 700.361235] [<f8ab522b>] bcm_can_tx+0x9d/0xd9 [can_bcm] [ 700.361245] [<f8ab597f>] bcm_tx_timeout_handler+0x6a/0xbc [can_bcm] [ 700.361255] [<f8ab5915>] bcm_tx_timeout_handler+0x0/0xbc [can_bcm] [ 700.361263] [<c0134143>] __run_hrtimer+0x5a/0x86 [ 700.361273] [<f8ab5915>] bcm_tx_timeout_handler+0x0/0xbc [can_bcm] [ 700.361282] [<c0134a50>] hrtimer_interrupt+0xb9/0x110 This patch moves the rest of the functionality from the hrtimer callback to the already existing tasklet to fix this slowpath problem. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14net: Add init_dummy_netdev() and fix EMAC driver using itBenjamin Herrenschmidt
This adds an init_dummy_netdev() function that gets a network device structure (allocation and lifetime entirely under caller's control) and initialize the minimum amount of fields so it can be used to schedule NAPI polls without registering a full blown interface. This is to be used by drivers that need to tie several hardware interfaces to a single NAPI poll scheduler due to HW limitations. It also updates the ibm_newemac driver to use that, this fixing the oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add() Symbol is exported GPL only a I don't think we want binary drivers doing that sort of acrobatics (if we want them at all). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14gso: Ensure that the packet is long enoughHerbert Xu
When we get a GSO packet from an untrusted source, we need to ensure that it is sufficiently long so that we don't end up crashing. Based on discovery and patch by Ian Campbell. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14gro: Fix page ref count for skbs freed normallyHerbert Xu
When an skb with page frags is merged into an existing one, we cannibalise its reference count. This is OK when the skb is reused because we set nr_frags to zero in that case. However, for the case where the skb is freed through kfree_skb, we didn't clear nr_frags which causes the page to be freed prematurely. This is fixed by moving the skb resetting into skb_gro_receive. Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14xfrm: For 32/64 compatability wrt. xfrm_usersa_infoDavid S. Miller
Reported by Jiri Klimes. Fix suggested by Patrick McHardy. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14gro: Check for GSO packets and packets with frag_listHerbert Xu
As GRO cannot be applied to packets with frag_list we need to make sure that we reject such packets if they are fed to us, e.g., through a tunnel device. Also there is no point in applying GRO on GSO packets so they too should be rejected. This allows GRO to be used in virtio-net which may produce GSO packets directly but may still benefit from GRO if the other end of it doesn't support GSO. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14[CVE-2009-0029] System call wrappers part 22Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] System call wrappers part 21Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] System call wrappers part 07Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-13ipv6: Fix fib6_dump_table walker leakHerbert Xu
When a fib6 table dump is prematurely ended, we won't unlink its walker from the list. This causes all sorts of grief for other users of the list later. Reported-by: Chris Caputo <ccaputo@alt.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-13tcp: splice as many packets as possible at onceWilly Tarreau
As spotted by Willy Tarreau, current splice() from tcp socket to pipe is not optimal. It processes at most one segment per call. This results in low performance and very high overhead due to syscall rate when splicing from interfaces which do not support LRO. Willy provided a patch inside tcp_splice_read(), but a better fix is to let tcp_read_sock() process as many segments as possible, so that tcp_rcv_space_adjust() and tcp_cleanup_rbuf() are called less often. With this change, splice() behaves like tcp_recvmsg(), being able to consume many skbs in one system call. With typical 1460 bytes of payload per frame, that means splice(SPLICE_F_NONBLOCK) can return 16*1460 = 23360 bytes. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-13Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-01-13mac80211: fix "‘ret’ may be used uninitialized" warningJohn W. Linville
net/mac80211/ht.c: In function ‘ieee80211_start_tx_ba_session’: net/mac80211/ht.c:472: warning: ‘ret’ may be used uninitialized in this function Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-12pkt_sched: sch_htb: Break all htb_do_events() after 2 jiffiesJarek Poplawski
Currently htb_do_events() breaks events recounting for a level after 2 jiffies, but there is no reason to repeat this for next levels and increase delays even more (with softirqs disabled). htb_dequeue_tree() can add to this too, btw. In such a case q->now time is invalid anyway. Thanks to Patrick McHardy for spotting an error around earlier version of this patch. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-12pkt_sched: sch_htb: Consider used jiffies in htb_do_events()Jarek Poplawski
Next event time should consider jiffies used for recounting. Otherwise qdisc_watchdog_schedule() triggers hrtimer immediately with the event in the past, and may cause very high ksoftirqd cpu usage (if highres is on). There is also removed checking "event" for zero in htb_dequeue(): it's always true in this place. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-12netfilter 08/09: xt_time: print timezone for user informationJan Engelhardt
netfilter: xt_time: print timezone for user information Let users have a way to figure out if their distro set the kernel timezone at all. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-12netfilter 07/09: simplify nf_conntrack_alloc() error handlingJulia Lawall
nf_conntrack_alloc cannot return NULL, so there is no need to check for NULL before using the value. I have also removed the initialization of ct to NULL in nf_conntrack_alloc, since the value is never used, and since perhaps it might lead one to think that return ct at the end might return NULL. The semantic patch that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match exists@ expression x, E; position p1,p2; statement S1, S2; @@ x@p1 = nf_conntrack_alloc(...) ... when != x = E ( if (x@p2 == NULL || ...) S1 else S2 | if (x@p2 == NULL && ...) S1 else S2 ) @other_match exists@ expression match.x, E1, E2; position p1!=match.p1,match.p2; @@ x@p1 = E1 ... when != x = E2 x@p2 @ script:python depends on !other_match@ p1 << match.p1; p2 << match.p2; @@ print "%s: call to nf_conntrack_alloc %s bad test %s" % (p1[0].file,p1[0].line,p2[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-12netfilter 06/09: nf_conntrack: fix ICMP/ICMPv6 timeout sysctls on big-endianPatrick McHardy
An old bug crept back into the ICMP/ICMPv6 conntrack protocols: the timeout values are defined as unsigned longs, the sysctl's maxsize is set to sizeof(unsigned int). Use unsigned int for the timeout values as in the other conntrack protocols. Reported-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-12netfilter 05/09: ebtables: fix inversion in match codeJan Engelhardt
Commit 8cc784ee (netfilter: change return types of match functions for ebtables extensions) broke ebtables matches by inverting the sense of match/nomatch. Reported-by: Matt Cross <matthltc@us.ibm.com> Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>