summaryrefslogtreecommitdiff
path: root/net/ipv6/xfrm6_policy.c
AgeCommit message (Collapse)Author
2018-07-03Merge 4.4.139 into android-4.4Greg Kroah-Hartman
Changes in 4.4.139 xfrm6: avoid potential infinite loop in _decode_session6() netfilter: ebtables: handle string from userspace with care ipvs: fix buffer overflow with sync daemon and service atm: zatm: fix memcmp casting net: qmi_wwan: Add Netgear Aircard 779S net/sonic: Use dma_mapping_error() Revert "Btrfs: fix scrub to repair raid6 corruption" tcp: do not overshoot window_clamp in tcp_rcv_space_adjust() Btrfs: make raid6 rebuild retry more usb: musb: fix remote wakeup racing with suspend bonding: re-evaluate force_primary when the primary slave name changes tcp: verify the checksum of the first data segment in a new connection ext4: update mtime in ext4_punch_hole even if no blocks are released ext4: fix fencepost error in check for inode count overflow during resize driver core: Don't ignore class_dir_create_and_add() failure. btrfs: scrub: Don't use inode pages for device replace ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream() ALSA: hda: add dock and led support for HP EliteBook 830 G5 ALSA: hda: add dock and led support for HP ProBook 640 G4 cpufreq: Fix new policy initialization during limits updates via sysfs libata: zpodd: make arrays cdb static, reduces object code size libata: zpodd: small read overflow in eject_tray() libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk w1: mxc_w1: Enable clock before calling clk_get_rate() on it fs/binfmt_misc.c: do not allow offset overflow x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec() m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap() serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version signal/xtensa: Consistenly use SIGBUS in do_unaligned_user usb: do not reset if a low-speed or full-speed device timed out 1wire: family module autoload fails because of upper/lower case mismatch. ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it ASoC: cirrus: i2s: Fix LRCLK configuration ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup lib/vsprintf: Remove atomic-unsafe support for %pCr mips: ftrace: fix static function graph tracing branch-check: fix long->int truncation when profiling branches ipmi:bt: Set the timeout before doing a capabilities check Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader fuse: atomic_o_trunc should truncate pagecache fuse: don't keep dead fuse_conn at fuse_fill_super(). fuse: fix control dir setup and teardown powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG powerpc/ptrace: Fix enforcement of DAWR constraints cpuidle: powernv: Fix promotion from snooze if next state disabled powerpc/fadump: Unregister fadump on kexec down path. ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size of: unittest: for strings, account for trailing \0 in property length field IB/qib: Fix DMA api warning with debug kernel RDMA/mlx4: Discard unknown SQP work requests mtd: cfi_cmdset_0002: Change write buffer to check correct value mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock() mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking. MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume MIPS: io: Add barrier after register read in inX() time: Make sure jiffies_to_msecs() preserves non-zero time periods Btrfs: fix clone vs chattr NODATASUM race iio:buffer: make length types match kfifo types scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread linvdimm, pmem: Preserve read-only setting for pmem devices md: fix two problems with setting the "re-add" device state. ubi: fastmap: Cancel work upon detach UBIFS: Fix potential integer overflow in allocation xfrm: Ignore socket policies when rebuilding hash tables xfrm: skip policies marked as dead while rehashing backlight: as3711_bl: Fix Device Tree node lookup backlight: max8925_bl: Fix Device Tree node lookup backlight: tps65217_bl: Fix Device Tree node lookup mfd: intel-lpss: Program REMAP register in PIO mode perf tools: Fix symbol and object code resolution for vdso32 and vdsox32 perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP perf intel-pt: Fix MTC timing after overflow perf intel-pt: Fix "Unexpected indirect branch" error perf intel-pt: Fix packet decoding of CYC packets media: v4l2-compat-ioctl32: prevent go past max size media: cx231xx: Add support for AverMedia DVD EZMaker 7 media: dvb_frontend: fix locking issues at dvb_frontend_get_event() nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message video: uvesafb: Fix integer overflow in allocation Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID xen: Remove unnecessary BUG_ON from __unbind_from_irq() udf: Detect incorrect directory size Input: elan_i2c_smbus - fix more potential stack buffer overflows Input: elantech - enable middle button of touchpads on ThinkPad P52 Input: elantech - fix V4 report decoding for module with middle key ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210 Btrfs: fix unexpected cow in run_delalloc_nocow spi: Fix scatterlist elements size in spi_map_buf block: Fix transfer when chunk sectors exceeds max dm thin: handle running out of data space vs concurrent discard cdc_ncm: avoid padding beyond end of skb Bluetooth: Fix connection if directed advertising and privacy is used Linux 4.4.139 Change-Id: I93013bedf2ebe3e6a8718972d8854723609963cc Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-03xfrm6: avoid potential infinite loop in _decode_session6()Eric Dumazet
[ Upstream commit d9f92772e8ec388d070752ee8f187ef8fa18621f ] syzbot found a way to trigger an infinitie loop by overflowing @offset variable that has been forced to use u16 for some very obscure reason in the past. We probably want to look at NEXTHDR_FRAGMENT handling which looks wrong, in a separate patch. In net-next, we shall try to use skb_header_pointer() instead of pskb_may_pull(). watchdog: BUG: soft lockup - CPU#1 stuck for 134s! [syz-executor738:4553] Modules linked in: irq event stamp: 13885653 hardirqs last enabled at (13885652): [<ffffffff878009d5>] restore_regs_and_return_to_kernel+0x0/0x2b hardirqs last disabled at (13885653): [<ffffffff87800905>] interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625 softirqs last enabled at (13614028): [<ffffffff84df0809>] tun_napi_alloc_frags drivers/net/tun.c:1478 [inline] softirqs last enabled at (13614028): [<ffffffff84df0809>] tun_get_user+0x1dd9/0x4290 drivers/net/tun.c:1825 softirqs last disabled at (13614032): [<ffffffff84df1b6f>] tun_get_user+0x313f/0x4290 drivers/net/tun.c:1942 CPU: 1 PID: 4553 Comm: syz-executor738 Not tainted 4.17.0-rc3+ #40 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:check_kcov_mode kernel/kcov.c:67 [inline] RIP: 0010:__sanitizer_cov_trace_pc+0x20/0x50 kernel/kcov.c:101 RSP: 0018:ffff8801d8cfe250 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffff8801d88a8080 RBX: ffff8801d7389e40 RCX: 0000000000000006 RDX: 0000000000000000 RSI: ffffffff868da4ad RDI: ffff8801c8a53277 RBP: ffff8801d8cfe250 R08: ffff8801d88a8080 R09: ffff8801d8cfe3e8 R10: ffffed003b19fc87 R11: ffff8801d8cfe43f R12: ffff8801c8a5327f R13: 0000000000000000 R14: ffff8801c8a4e5fe R15: ffff8801d8cfe3e8 FS: 0000000000d88940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffff600400 CR3: 00000001acab3000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: _decode_session6+0xc1d/0x14f0 net/ipv6/xfrm6_policy.c:150 __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:2368 xfrm_decode_session_reverse include/net/xfrm.h:1213 [inline] icmpv6_route_lookup+0x395/0x6e0 net/ipv6/icmp.c:372 icmp6_send+0x1982/0x2da0 net/ipv6/icmp.c:551 icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43 ip6_input_finish+0x14e1/0x1a30 net/ipv6/ip6_input.c:305 NF_HOOK include/linux/netfilter.h:288 [inline] ip6_input+0xe1/0x5e0 net/ipv6/ip6_input.c:327 dst_input include/net/dst.h:450 [inline] ip6_rcv_finish+0x29c/0xa10 net/ipv6/ip6_input.c:71 NF_HOOK include/linux/netfilter.h:288 [inline] ipv6_rcv+0xeb8/0x2040 net/ipv6/ip6_input.c:208 __netif_receive_skb_core+0x2468/0x3650 net/core/dev.c:4646 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4711 netif_receive_skb_internal+0x126/0x7b0 net/core/dev.c:4785 napi_frags_finish net/core/dev.c:5226 [inline] napi_gro_frags+0x631/0xc40 net/core/dev.c:5299 tun_get_user+0x3168/0x4290 drivers/net/tun.c:1951 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1996 call_write_iter include/linux/fs.h:1784 [inline] do_iter_readv_writev+0x859/0xa50 fs/read_write.c:680 do_iter_write+0x185/0x5f0 fs/read_write.c:959 vfs_writev+0x1c7/0x330 fs/read_write.c:1004 do_writev+0x112/0x2f0 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev fs/read_write.c:1109 [inline] __x64_sys_writev+0x75/0xb0 fs/read_write.c:1109 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: syzbot+0053c8...@syzkaller.appspotmail.com Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-11BACKPORT: net: xfrm: support setting an output mark.Lorenzo Colitti
On systems that use mark-based routing it may be necessary for routing lookups to use marks in order for packets to be routed correctly. An example of such a system is Android, which uses socket marks to route packets via different networks. Currently, routing lookups in tunnel mode always use a mark of zero, making routing incorrect on such systems. This patch adds a new output_mark element to the xfrm state and a corresponding XFRMA_OUTPUT_MARK netlink attribute. The output mark differs from the existing xfrm mark in two ways: 1. The xfrm mark is used to match xfrm policies and states, while the xfrm output mark is used to set the mark (and influence the routing) of the packets emitted by those states. 2. The existing mark is constrained to be a subset of the bits of the originating socket or transformed packet, but the output mark is arbitrary and depends only on the state. The use of a separate mark provides additional flexibility. For example: - A packet subject to two transforms (e.g., transport mode inside tunnel mode) can have two different output marks applied to it, one for the transport mode SA and one for the tunnel mode SA. - On a system where socket marks determine routing, the packets emitted by an IPsec tunnel can be routed based on a mark that is determined by the tunnel, not by the marks of the unencrypted packets. - Support for setting the output marks can be introduced without breaking any existing setups that employ both mark-based routing and xfrm tunnel mode. Simply changing the code to use the xfrm mark for routing output packets could xfrm mark could change behaviour in a way that breaks these setups. If the output mark is unspecified or set to zero, the mark is not set or changed. [backport of upstream 077fbac405bfc6d41419ad6c1725804ad4e9887c] Bug: 63589535 Test: https://android-review.googlesource.com/452776/ passes Tested: make allyesconfig; make -j64 Tested: https://android-review.googlesource.com/452776 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Change-Id: I76120fba036e21780ced31ad390faf491ea81e52
2017-10-11UPSTREAM: xfrm: Only add l3mdev oif to dst lookupsDavid Ahern
Subash reported that commit 42a7b32b73d6 ("xfrm: Add oif to dst lookups") broke a wifi use case that uses fib rules and xfrms. The intent of 42a7b32b73d6 was driven by VRFs with IPsec. As a compromise relax the use of oif in xfrm lookups to L3 master devices only (ie., oif is either an L3 master device or is enslaved to a master device). [cherry-pick of upstream 11d7a0bb95eaaba1741bb24a7c3c169c82f09c7b] Bug: 63589535 Change-Id: Ibadb15341f6c6c7077eccfaa2c66b3bb86b251bf Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups") Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-12-22Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2015-12-22 Just one patch to fix dst_entries_init with multiple namespaces. From Dan Streetman. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03xfrm: dst_entries_init() per-net dst_opsDan Streetman
Remove the dst_entries_init/destroy calls for xfrm4 and xfrm6 dst_ops templates; their dst_entries counters will never be used. Move the xfrm dst_ops initialization from the common xfrm/xfrm_policy.c to xfrm4/xfrm4_policy.c and xfrm6/xfrm6_policy.c, and call dst_entries_init and dst_entries_destroy for each net namespace. The ipv4 and ipv6 xfrms each create dst_ops template, and perform dst_entries_init on the templates. The template values are copied to each net namespace's xfrm.xfrm*_dst_ops. The problem there is the dst_ops pcpuc_entries field is a percpu counter and cannot be used correctly by simply copying it to another object. The result of this is a very subtle bug; changes to the dst entries counter from one net namespace may sometimes get applied to a different net namespace dst entries counter. This is because of how the percpu counter works; it has a main count field as well as a pointer to the percpu variables. Each net namespace maintains its own main count variable, but all point to one set of percpu variables. When any net namespace happens to change one of the percpu variables to outside its small batch range, its count is moved to the net namespace's main count variable. So with multiple net namespaces operating concurrently, the dst_ops entries counter can stray from the actual value that it should be; if counts are consistently moved from one net namespace to another (which my testing showed is likely), then one net namespace winds up with a negative dst_ops count while another winds up with a continually increasing count, eventually reaching its gc_thresh limit, which causes all new traffic on the net namespace to fail with -ENOBUFS. Signed-off-by: Dan Streetman <dan.streetman@canonical.com> Signed-off-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-10-30Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2015-10-30 1) The flow cache is limited by the flow cache limit which depends on the number of cpus and the xfrm garbage collector threshold which is independent of the number of cpus. This leads to the fact that on systems with more than 16 cpus we hit the xfrm garbage collector limit and refuse new allocations, so new flows are dropped. On systems with 16 or less cpus, we hit the flowcache limit. In this case, we shrink the flow cache instead of refusing new flows. We increase the xfrm garbage collector threshold to INT_MAX to get the same behaviour, independent of the number of cpus. 2) Fix some unaligned accesses on sparc systems. From Sowmini Varadhan. 3) Fix some header checks in _decode_session4. We may call pskb_may_pull with a negative value converted to unsigened int from pskb_may_pull. This can lead to incorrect policy lookups. We fix this by a check of the data pointer position before we call pskb_may_pull. 4) Reload skb header pointers after calling pskb_may_pull in _decode_session4 as this may change the pointers into the packet. 5) Add a missing statistic counter on inner mode errors. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/ipv6/xfrm6_output.c net/openvswitch/flow_netlink.c net/openvswitch/vport-gre.c net/openvswitch/vport-vxlan.c net/openvswitch/vport.c net/openvswitch/vport.h The openvswitch conflicts were overlapping changes. One was the egress tunnel info fix in 'net' and the other was the vport ->send() op simplification in 'net-next'. The xfrm6_output.c conflicts was also a simplification overlapping a bug fix. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2015-10-22 1) Fix IPsec pre-encap fragmentation for GSO packets. From Herbert Xu. 2) Fix some header checks in _decode_session6. We skip the header informations if the data pointer points already behind the header in question for some protocols. This is because we call pskb_may_pull with a negative value converted to unsigened int from pskb_may_pull in this case. Skipping the header informations can lead to incorrect policy lookups. From Mathias Krause. 3) Allow to change the replay threshold and expiry timer of a state without having to set other attributes like replay counter and byte lifetime. Changing these other attributes may break the SA. From Michael Rossberg. 4) Fix pmtu discovery for local generated packets. We may fail dispatch to the inner address family. As a reault, the local error handler is not called and the mtu value is not reported back to userspace. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07net: Fix vti use case with oif in dst lookups for IPv6David Ahern
It occurred to me yesterday that 741a11d9e4103 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set") means that xfrm6_dst_lookup needs the FLOWI_FLAG_SKIP_NH_OIF flag set. This latest commit causes the oif to be considered in lookups which is known to break vti. This explains why 58189ca7b274 did not the IPv6 change at the time it was submitted. Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07net: Fix vti use case with oif in dst lookups for IPv6David Ahern
It occurred to me yesterday that 741a11d9e4103 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set") means that xfrm6_dst_lookup needs the FLOWI_FLAG_SKIP_NH_OIF flag set. This latest commit causes the oif to be considered in lookups which is known to break vti. This explains why 58189ca7b274 did not the IPv6 change at the time it was submitted. Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Replace vrf_master_ifindex{, _rcu} with l3mdev equivalentsDavid Ahern
Replace calls to vrf_master_ifindex_rcu and vrf_master_ifindex with either l3mdev_master_ifindex_rcu or l3mdev_master_ifindex. The pattern: oif = vrf_master_ifindex(dev) ? : dev->ifindex; is replaced with oif = l3mdev_fib_oif(dev); And remove the now unused vrf macros. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29xfrm: Let the flowcache handle its size by default.Steffen Klassert
The xfrm flowcache size is limited by the flowcache limit (4096 * number of online cpus) and the xfrm garbage collector threshold (2 * 32768), whatever is reached first. This means that we can hit the garbage collector limit only on systems with more than 16 cpus. On such systems we simply refuse new allocations if we reach the limit, so new flows are dropped. On syslems with 16 or less cpus, we hit the flowcache limit. In this case, we shrink the flow cache instead of refusing new flows. We increase the xfrm garbage collector threshold to INT_MAX to get the same behaviour, independent of the number of cpus. The xfrm garbage collector threshold can still be set below the flowcache limit to reduce the memory usage of the flowcache. Tested-by: Dan Streetman <dan.streetman@canonical.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-09-14xfrm6: Fix ICMPv6 and MH header checks in _decode_session6Mathias Krause
Ensure there's enough data left prior calling pskb_may_pull(). If skb->data was already advanced, we'll call pskb_may_pull() with a negative value converted to unsigned int -- leading to a huge positive value. That won't matter in practice as pskb_may_pull() will likely fail in this case, but it leads to underflow reports on kernels handling such kind of over-/underflows, e.g. a PaX enabled kernel instrumented with the size_overflow plugin. Reported-by: satmd <satmd@lain.at> Reported-and-tested-by: Marcin Jurkowski <marcin1j@gmail.com> Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: PaX Team <pageexec@freemail.hu> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-08-25xfrm: Use VRF master index if output device is enslavedDavid Ahern
Directs route lookups to VRF table. Compiles out if NET_VRF is not enabled. With this patch able to successfully bring up ipsec tunnels in VRFs, even with duplicate network configuration. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11xfrm: Add oif to dst lookupsDavid Ahern
Rules can be installed that direct route lookups to specific tables based on oif. Plumb the oif through the xfrm lookups so it gets set in the flow struct and passed to the resolver routines. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-05-25ipv6: Add rt6_get_cookie() functionMartin KaFai Lau
Instead of doing the rt6->rt6i_node check whenever we need to get the route's cookie. Refactor it into rt6_get_cookie(). It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also percpu rt6_info later. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-01ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peerMartin KaFai Lau
_rt6i_peer is no longer needed after the last patch, 'ipv6: Stop rt6_info from using inet_peer's metrics'. DST_METRICS_FORCE_OVERWRITE is added by commit e5fd387ad5b3 ("ipv6: do not overwrite inetpeer metrics prematurely"). Since inetpeer is no longer used for metrics, this bit is also not needed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31xfrm: simplify xfrm_address_t useJiri Benc
In many places, the a6 field is typecasted to struct in6_addr. As the fields are in union anyway, just add in6_addr type to the union and get rid of the typecasting. Modifying the uapi header is okay, the union has still the same size. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31ipv6: coding style: comparison for equality with NULLIan Morris
The ipv6 code uses a mixture of coding styles. In some instances check for NULL pointer is done as x == NULL and sometimes as !x. !x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-09net: Remove protocol from struct dst_opsEric W. Biederman
After my change to neigh_hh_init to obtain the protocol from the neigh_table there are no more users of protocol in struct dst_ops. Remove the protocol field from dst_ops and all of it's initializers. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06xfrm6: Fix a offset value for network header in _decode_session6Hajime Tazaki
When a network-layer header has multiple IPv6 extension headers, then offset for mobility header goes wrong. This regression breaks an xfrm policy lookup for a particular receive packet. Binding update packets of Mobile IPv6 are all discarded without this fix. Fixes: de3b7a06dfe1 ("xfrm6: Fix transport header offset in _decode_session6.") Signed-off-by: Hajime Tazaki <tazaki@sfc.wide.ad.jp> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-12-08xfrm6: Fix the nexthdr offset in _decode_session6.Steffen Klassert
xfrm_decode_session() was originally designed for the usage in the receive path where the correct nexthdr offset is stored in IP6CB(skb)->nhoff. Over time this function spread to code that is used in the output path (netfilter, vti) where IP6CB(skb)->nhoff is not set. As a result, we get a wrong nexthdr and the upper layer flow informations are wrong. This can leed to incorrect policy lookups. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-12-08xfrm6: Fix transport header offset in _decode_session6.Steffen Klassert
skb->transport_header might not be valid when we do a reverse decode because the ipv6 tunnel error handlers don't update it to the inner transport header. This leads to a wrong offset calculation and to wrong layer 4 informations. We fix this by using the size of the ipv6 header as the first offset. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-10-22xfrm6: fix a potential use after free in xfrm6_policy.cLi RongQing
pskb_may_pull() maybe change skb->data and make nh and exthdr pointer oboslete, so recompute the nd and exthdr Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-24ipv6: White-space cleansing : Line LayoutsIan Morris
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. A number of items are addressed in this patch: * Multiple spaces converted to tabs * Spaces before tabs removed. * Spaces in pointer typing cleansed (char *)foo etc. * Remove space after sizeof * Ensure spacing around comparators such as if statements. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14xfrm6: Add IPsec protocol multiplexerSteffen Klassert
This patch adds an IPsec protocol multiplexer for ipv6. With this it is possible to add alternative protocol handlers, as needed for IPsec virtual tunnel interfaces. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-11-01xfrm: Fix null pointer dereference when decoding sessionsSteffen Klassert
On some codepaths the skb does not have a dst entry when xfrm_decode_session() is called. So check for a valid skb_dst() before dereferencing the device interface index. We use 0 as the device index if there is no valid skb_dst(), or at reverse decoding we use skb_iif as device interface index. Bug was introduced with git commit bafd4bd4dc ("xfrm: Decode sessions with output interface."). Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-28xfrm: Increase the garbage collector thresholdSteffen Klassert
With the removal of the routing cache, we lost the option to tweak the garbage collector threshold along with the maximum routing cache size. So git commit 703fb94ec ("xfrm: Fix the gc threshold value for ipv4") moved back to a static threshold. It turned out that the current threshold before we start garbage collecting is much to small for some workloads, so increase it from 1024 to 32768. This means that we start the garbage collector if we have more than 32768 dst entries in the system and refuse new allocations if we are above 65536. Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-16xfrm: Decode sessions with output interface.Steffen Klassert
The output interface matching does not work on forward policy lookups, the output interface of the flowi is always 0. Fix this by setting the output interface when we decode the session. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-05-11xfrm6: release dev before returning errorCong Wang
We forget to call dev_put() on error path in xfrm6_fill_dst(), its caller doesn't handle this. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06xfrm: make gc_thresh configurable in all namespacesMichal Kubecek
The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh sysctl but currently only in init_net, other namespaces always use the default value. This can substantially limit the number of IPsec tunnels that can be effectively used. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-01-17ipv6: Complete neighbour entry removal from dst_entry.YOSHIFUJI Hideaki / 吉藤英明
CC: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-16xfrm6: Remove commented out function call to xfrm6_input_finiSteffen Klassert
xfrm6_input_fini() is not in the tree since more than 10 years, so remove the commented out function call. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2012-11-15xfrm: Use a static gc threshold value for ipv6Steffen Klassert
Unlike ipv4 did, ipv6 does not handle the maximum number of cached routes dynamically. So no need to try to handle the IPsec gc threshold value dynamically. This patch sets the IPsec gc threshold value back to 1024 routes, as it is for non-IPsec routes. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2012-11-01ipv6: use IS_ENABLED()Amerigo Wang
#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE) can be replaced by #if IS_ENABLED(CONFIG_FOO) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-20net: ipv6: fix oops in inet_putpeer()Patrick McHardy
Commit 97bab73f (inet: Hide route peer accesses behind helpers.) introduced a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not initialized, causing a false positive result from inetpeer_ptr_is_peer(), which in turn causes a NULL pointer dereference in inet_putpeer(). Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0 EIP is at inet_putpeer+0xe/0x16 EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641 ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4 DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28 f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4 f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8 [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb [<c038d5f7>] dst_destroy+0x1d/0xa4 [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36 [<c0396870>] flow_cache_gc_task+0x85/0x9f [<c0142d2b>] process_one_work+0x122/0x441 [<c043feb5>] ? apic_timer_interrupt+0x31/0x38 [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b [<c0143e2d>] worker_thread+0x113/0x3cc Fix by adding a init_dst() callback to struct xfrm_policy_afinfo to properly initialize the dst's peer pointer. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}()David S. Miller
This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Add redirect support to all protocol icmp error handlers.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: Store route neighbour in rt6_info struct.David S. Miller
This makes for a simplified conversion away from dst_get_neighbour*(). All code outside of ipv6 will use neigh lookups via dst_neigh_lookup*(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11inet: Hide route peer accesses behind helpers.David S. Miller
We encode the pointer(s) into an unsigned long with one state bit. The state bit is used so we can store the inetpeer tree root to use when resolving the peer later. Later the peer roots will be per-FIB table, and this change works to facilitate that. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-20net: Convert all sysctl registrations to register_net_sysctlEric W. Biederman
This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22net: remove ipv6_addr_copy()Alexey Dobriyan
C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-22inet: constify ip headers and in6_addrEric Dumazet
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12net: Put fl6_* macros to struct flowi6 and use them again.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12ipv6: Convert to use flowi6 where applicable.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12net: Use flowi4 and flowi6 in xfrm layer.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12net: Make flowi ports AF dependent.David S. Miller
Create two sets of port member accessors, one set prefixed by fl4_* and the other prefixed by fl6_* This will let us to create AF optimal flow instances. It will work because every context in which we access the ports, we have to be fully aware of which AF the flowi is anyways. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller
I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>