2016-09-20 ANDROID: fs: FS tracepoints to track IO. (Mohan Srinivasan)
Adds tracepoints in ext4/f2fs/mpage to track readpages/buffered write()s. This allows us to tie the files being read/written back to the PIDs doing the IO. Change-Id: I26bd36f933108927d6903da04d8cb42fd9c3ef3d Signed-off-by: Mohan Srinivasan <srmohan@google.com>
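For illustration, a tracepoint of this kind might be declared roughly as follows; the event and field names here are modeled on the android_fs events and are assumptions, not the exact patch contents:

    #include <linux/tracepoint.h>
    #include <linux/fs.h>

    TRACE_EVENT(android_fs_dataread_start,
            TP_PROTO(struct inode *inode, loff_t offset, int bytes,
                     pid_t pid, char *command),
            TP_ARGS(inode, offset, bytes, pid, command),
            TP_STRUCT__entry(
                    __field(ino_t,  ino)
                    __field(loff_t, offset)
                    __field(int,    bytes)
                    __field(pid_t,  pid)
                    __string(cmdline, command)
            ),
            TP_fast_assign(
                    __entry->ino    = inode->i_ino;
                    __entry->offset = offset;
                    __entry->bytes  = bytes;
                    __entry->pid    = pid;
                    __assign_str(cmdline, command);
            ),
            TP_printk("ino %lu pos %llu bytes %d cmdline %s pid %d",
                      (unsigned long)__entry->ino, __entry->offset,
                      __entry->bytes, __get_str(cmdline), __entry->pid)
    );

Hooks placed in the ext4/f2fs readpages and write paths then fire this event with the inode, file offset, and the current task's pid/comm.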
2016-09-20 sched/walt: Drop arch-specific timer access (Chris Redpath)
On at least one platform, the timer providing the wallclock could occasionally be reset or go backwards for some time after wakeup. Accept that this might happen and warn the first time, but otherwise just carry on. Change-Id: Id3164477ba79049561af7f0889cbeebc199ead4e Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2016-09-20 ANDROID: fiq_debugger: Pass task parameter to unwind_frame() (Jeff Vander Stoep)
Fixes: fe13f95b7200 ("arm64: pass a task parameter to unwind_frame()") Bug: 30369029 Patchset: rework-pagetable Change-Id: I9a4ab50ef61532d27282f189f063c938c196ec08 Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
2016-09-20 eas/sched/fair: Fixing comments in find_best_target. (Srinath Sridharan)
Change-Id: I83f5b9887e98f9fdb81318cde45408e7ebfc4b13 Signed-off-by: Srinath Sridharan <srinathsr@google.com>
2016-09-20 input: keyreset: switch to orderly_reboot (Eric Ernst)
The prior restart function made a call to sys_sync and then executed a kernel reset. Rather than call the sync directly, thus necessitating this driver to be builtin, call orderly_reboot, which will take care of the file system sync. Note: since the CONFIG_INPUT Kconfig is tristate, this driver can be built as a module, despite being marked bool. Signed-off-by: Eric Ernst <eric.ernst@linux.intel.com>
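A minimal sketch of the change (the surrounding work-queue plumbing is assumed, not quoted from the driver):

    #include <linux/reboot.h>
    #include <linux/workqueue.h>

    static void do_restart(struct work_struct *unused)
    {
            /* Before, forcing the driver to be built in:
             *   sys_sync();
             *   kernel_restart(NULL);
             */

            /* After: orderly_reboot() syncs filesystems itself and is an
             * exported symbol, so a modular driver can call it. */
            orderly_reboot();
    }

The design point is that orderly_reboot() runs the reboot through the usual userspace-assisted path, so the driver no longer needs direct access to sys_sync.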
2016-09-20 UPSTREAM: tun: fix transmit timestamp support (Soheil Hassas Yeganeh)
Instead of using sock_tx_timestamp, use skb_tx_timestamp to record software transmit timestamp of a packet. sock_tx_timestamp resets and overrides the tx_flags of the skb. The function is intended to be called from within the protocol layer when creating the skb, not from a device driver. This is inconsistent with other drivers and will cause issues for TCP. In TCP, we intend to sample the timestamps for the last byte for each sendmsg/sendpage. For that reason, tcp_sendmsg calls tcp_tx_timestamp only with the last skb that it generates. For example, if a 128KB message is split into two 64KB packets we want to sample the SND timestamp of the last packet. The current code in the tun driver, however, will result in sampling the SND timestamp for both packets. Also, when the last packet is split into smaller packets for retransmission (see tcp_fragment), the tun driver will record timestamps for all of the retransmitted packets and not only the last packet. Change-Id: If7458ab31de52aa15a12364b6c1ac2a8f93f17a7 Fixes: eda297729171 (tun: Support software transmit time stamping.) Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Francis Yan <francisyyan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
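The fix boils down to one call in the transmit path; a sketch of the relevant lines in tun_net_xmit():

    #include <linux/skbuff.h>

    static void tun_tx_timestamp_sketch(struct sk_buff *skb)
    {
            /* Before: sock_tx_timestamp() rewrote skb_shinfo(skb)->tx_flags,
             * re-marking every skb, including TCP retransmits. */

            /* After: only report the software timestamp; the protocol layer
             * already chose which skbs carry SKBTX_SW_TSTAMP. */
            skb_tx_timestamp(skb);
    }

skb_tx_timestamp() is the helper device drivers are expected to use: it samples the timestamp only if the skb was flagged by the protocol, rather than deciding the flagging itself.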
2016-09-18 BACKPORT: arm64: mm: create new fine-grained mappings at boot (Mark Rutland)
At boot we may change the granularity of the tables mapping the kernel (by splitting or making sections). This may happen when we create the linear mapping (in __map_memblock), or at any point we try to apply fine-grained permissions to the kernel (e.g. fixup_executable, mark_rodata_ro, fixup_init). Changing the active page tables in this manner may result in multiple entries for the same address being allocated into TLBs, risking problems such as TLB conflict aborts or issues derived from the amalgamation of TLB entries. Generally, a break-before-make (BBM) approach is necessary to avoid conflicts, but we cannot do this for the kernel tables as it risks unmapping text or data being used to do so. Instead, we can create a new set of tables from scratch in the safety of the existing mappings, and subsequently migrate over to these using the new cpu_replace_ttbr1 helper, which avoids the two sets of tables being active simultaneously. To avoid issues when we later modify permissions of the page tables (e.g. in fixup_init), we must create the page tables at a granularity such that later modification does not result in splitting of tables. This patch applies this strategy, creating a new set of fine-grained page tables from scratch, and safely migrating to them. The existing fixmap and kasan shadow page tables are reused in the new fine-grained tables. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 068a17a5805dfbca4bbf03e664ca6b19709cc7a8) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I7ddaa296b94106846ebf8392d91186df8673df64
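The core of paging_init() after this change, lightly trimmed from the upstream commit (error handling and the folded-level cleanup are elided):

    void __init paging_init(void)
    {
            phys_addr_t pgd_phys = early_pgtable_alloc();
            pgd_t *pgd = pgd_set_fixmap(pgd_phys);

            /* Build a complete fine-grained copy while the old tables are
             * still live and safe to use. */
            map_kernel(pgd);
            map_mem(pgd);

            /* Switch to the new tables via the idmap, then migrate the top
             * level back into swapper_pg_dir so secondaries and cpu_switch_mm
             * keep using the well-known pgd address. */
            cpu_replace_ttbr1(__va(pgd_phys));
            memcpy(swapper_pg_dir, pgd, PAGE_SIZE);
            cpu_replace_ttbr1(swapper_pg_dir);

            pgd_clear_fixmap();
            memblock_free(pgd_phys, PAGE_SIZE);
    }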
2016-09-18 BACKPORT: arm64: ensure _stext and _etext are page-aligned (Mark Rutland)
Currently we have separate ALIGN_DEBUG_RO{,_MIN} directives to align _etext and __init_begin. While we ensure that __init_begin is page-aligned, we do not provide the same guarantee for _etext. This is not problematic currently as the alignment of __init_begin is sufficient to prevent issues when we modify permissions. Subsequent patches will assume page alignment of segments of the kernel we wish to map with different permissions. To ensure this, move _etext after the ALIGN_DEBUG_RO_MIN for the init section. This renders the prior ALIGN_DEBUG_RO irrelevant, and hence it is removed. Likewise, upgrade to ALIGN_DEBUG_RO_MIN(PAGE_SIZE) for _stext. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit fca082bfb543ccaaff864fc0892379ccaa1711cd) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: If1a829aa5c363f2c76eb1a18209c3c00ee3ffd0a
2016-09-18 UPSTREAM: arm64: mm: allow passing a pgdir to alloc_init_* (Mark Rutland)
To allow us to initialise pgdirs which are fixmapped, allow explicitly passing a pgdir rather than an mm. A new __create_pgd_mapping function is added for this, with existing __create_mapping callers migrated to this. The mm argument was previously only used at the top level. Now that it is redundant at all levels, it is removed. To indicate its new found similarity to alloc_init_{pud,pmd,pte}, __create_mapping is renamed to init_pgd. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 11509a306bb6ea595878b2d246d2d56b1783e040) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I554f90b986a0fce6c96c0a0a9da1d6e61602d0a9
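The new entry point is a thin wrapper; a sketch matching the shape of the upstream change:

    static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
                                     unsigned long virt, phys_addr_t size,
                                     pgprot_t prot,
                                     phys_addr_t (*pgtable_alloc)(void))
    {
            /* The caller picks the pgdir (swapper, idmap, or a fixmapped
             * copy); no mm is needed at any level below this. */
            init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot,
                     pgtable_alloc);
    }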
2016-09-18 UPSTREAM: arm64: mm: allocate pagetables anywhere (Mark Rutland)
Now that create_mapping uses fixmap slots to modify pte, pmd, and pud entries, we can access page tables anywhere in physical memory, regardless of the extent of the linear mapping. Given that, we no longer need to limit memblock allocations during page table creation, and can leave the limit as its default MEMBLOCK_ALLOC_ANYWHERE. We never add memory which will fall outside of the linear map range given phys_offset and MAX_MEMBLOCK_ADDR are configured appropriately, so any tables we create will fall in the linear map of the final tables. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit cdef5f6e9e0e5ee397759b664a9f875ff59ccf01) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I9b6487491f47351303a147147c3d274da20def59
2016-09-18 UPSTREAM: arm64: mm: use fixmap when creating page tables (Mark Rutland)
As a preparatory step to allow us to allocate early page tables from unmapped memory using memblock_alloc, modify the __create_mapping callees to map and unmap the tables they modify using fixmap entries. All but the top-level pgd initialisation is performed via the fixmap. Subsequent patches will inject the pgd physical address, and migrate to using the FIX_PGD slot. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit f4710445458c0a1bd1c3c014ada2e7d7dc7b882f) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I71d2fb02d009915b5aab7f3467c4a3c658e898de
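A sketch of the pattern in alloc_init_pte() after the change (the section-split handling is elided):

    static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
                               unsigned long end, unsigned long pfn,
                               pgprot_t prot,
                               phys_addr_t (*pgtable_alloc)(void))
    {
            pte_t *pte;

            if (pmd_none(*pmd)) {
                    phys_addr_t pte_phys = pgtable_alloc();
                    __pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
            }

            /* Map the table through the FIX_PTE slot, walk it, unmap it.
             * The table itself need not be in the linear map. */
            pte = pte_set_fixmap_offset(pmd, addr);
            do {
                    set_pte(pte, pfn_pte(pfn, prot));
                    pfn++;
            } while (pte++, addr += PAGE_SIZE, addr != end);

            pte_clear_fixmap();
    }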
2016-09-18 UPSTREAM: arm64: mm: add functions to walk tables in fixmap (Mark Rutland)
As a preparatory step to allow us to allocate early page tables from unmapped memory using memblock_alloc, add new p??_{set,clear}_fixmap* functions which can be used to walk page tables outside of the linear mapping by using fixmap slots. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 961faac114819a01e627fe9c9c82b830bb3849d4) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I58a0f54d2c637ef0fda8e0b9187ade969aecd347
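The helpers are simple wrappers over dedicated fixmap slots; the pte-level pair, as in the upstream commit:

    /* Map a table at physical address 'addr' through the FIX_PTE slot,
     * returning a usable VA; tear the mapping down when done. */
    #define pte_set_fixmap(addr)    ((pte_t *)set_fixmap_offset(FIX_PTE, addr))
    #define pte_set_fixmap_offset(pmd, addr) \
            pte_set_fixmap(pte_offset_phys(pmd, addr))
    #define pte_clear_fixmap()      clear_fixmap(FIX_PTE)

Equivalent pmd/pud/pgd variants use FIX_PMD, FIX_PUD, and FIX_PGD slots.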
2016-09-18 UPSTREAM: arm64: mm: add __{pud,pgd}_populate (Mark Rutland)
We currently have __pmd_populate for creating a pmd table entry given the physical address of a pte, but don't have equivalents for the pud or pgd levels of table. To enable us to manipulate tables which are mapped outside of the linear mapping (where we have a PA, but not a linear map VA), it is useful to have these functions. This patch adds __{pud,pgd}_populate. As these should not be called when the kernel uses folded {pmd,pud}s, in these cases they expand to BUILD_BUG(). So long as the appropriate checks are made on the {pud,pgd} entry prior to attempting population, these should be optimized out at compile time. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 1e531cce68c92b46c7d29f36a72f9a3e5886678f) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I4bb3b67d0e8b63dfbd1292a93bdde42ead23a5c2
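The pud-level helper, following the upstream commit (the pgd variant is analogous):

    #if CONFIG_PGTABLE_LEVELS > 2
    static inline void __pud_populate(pud_t *pud, phys_addr_t pmd,
                                      pudval_t prot)
    {
            set_pud(pud, __pud(pmd | prot));
    }
    #else
    static inline void __pud_populate(pud_t *pud, phys_addr_t pmd,
                                      pudval_t prot)
    {
            BUILD_BUG();    /* folded pmd: must never be reachable */
    }
    #endif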
2016-09-18 UPSTREAM: arm64: mm: avoid redundant __pa(__va(x)) (Mark Rutland)
When we "upgrade" to a section mapping, we free any table we made redundant by giving it back to memblock. To get the PA, we acquire the physical address and convert this to a VA, then subsequently convert this back to a PA. This works currently, but will not work if the tables are not accessed via linear map VAs (e.g. if we use fixmap slots). This patch uses {pmd,pud}_page_paddr to acquire the PA. This avoids the __pa(__va()) round trip, saving some work and avoiding reliance on the linear mapping. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 316b39db06718d59d82736df9fc65cf05b467cc7) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I26ed77790b5b00a15ea09a5a23a176de5b73a002
2016-09-18 UPSTREAM: arm64: mm: add functions to walk page tables by PA (Mark Rutland)
To allow us to walk tables allocated into the fixmap, we need to acquire the physical address of a page, rather than the virtual address in the linear map. This patch adds new p??_page_paddr and p??_offset_phys functions to acquire the physical address of a next-level table, and changes p??_offset* into macros which simply convert this to a linear map VA. This renders the p??_page_vaddr functions unused, and hence they are removed. At the pgd level, a new pgd_offset_raw function is added to find the relevant PGD entry given the base of a PGD and a virtual address. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit dca56dca7124709f3dfca81afe61b4d98eb9cacf) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Ie43d00014322e29aec70617db3af242ed8545738
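The pte-level additions, as in the upstream commit (pmd/pud levels follow the same pattern):

    static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
    {
            return pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK;
    }

    /* PA of a pte within the next-level table, then (optionally) the
     * linear-map VA derived from it. */
    #define pte_offset_phys(dir, addr) \
            (pmd_page_paddr(*(dir)) + pte_index(addr) * sizeof(pte_t))
    #define pte_offset_kernel(dir, addr) \
            ((pte_t *)__va(pte_offset_phys((dir), (addr))))

    /* Walk an arbitrary pgd given only its base address, without an mm. */
    #define pgd_offset_raw(pgd, addr)       ((pgd) + pgd_index(addr))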
2016-09-18 UPSTREAM: arm64: mm: move pte_* macros (Mark Rutland)
For pmd, pud, and pgd levels of table, functions including p?d_index and p?d_offset are defined after the p?d_page_vaddr function for the immediately higher level of table. The pte functions however are defined much earlier, even though several rely on the later definition of pmd_page_vaddr. While this isn't currently a problem as these are macros, it prevents the logical grouping of later C functions (which cannot rely on prototypes for functions not yet defined). Move these definitions after pmd_page_vaddr, for consistency with the placement of these functions for other levels of table. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 053520f7d3923cc6d37afb28f9887cb1e7d77454) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I44c7327dc0f257188c0c2204ec7da0e7fdca7f64
2016-09-18 UPSTREAM: arm64: kasan: avoid TLB conflicts (Mark Rutland)
The page table modification performed during the KASAN init risks the allocation of conflicting TLB entries, as it swaps a set of valid global entries for another without suitable TLB maintenance. The presence of conflicting TLB entries can result in the delivery of synchronous TLB conflict aborts, or may result in the use of erroneous data being returned in response to a TLB lookup. This can affect explicit data accesses from software as well as translations performed asynchronously (e.g. as part of page table walks or speculative I-cache fetches), and can therefore result in a wide variety of problems. To avoid this, use cpu_replace_ttbr1 to swap the page tables. This ensures that when the new tables are installed there are no stale entries from the old tables which may conflict. As all updates are made to the tables while they are not active, the updates themselves are safe. At the same time, add the missing barrier to ensure that the tmp_pg_dir entries updated via memcpy are visible to the page table walkers at the point the tmp_pg_dir is installed. All other page table updates made as part of KASAN initialisation have the requisite barriers due to the use of the standard page table accessors. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit c1a88e9124a499939ebd8069d5e4d3937f019157) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I0934b236e25e015d6946dcedcca14ba9512418f7
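The resulting shape of kasan_init(), trimmed to the swap itself:

    void __init kasan_init(void)
    {
            /* Stage the shadow rebuild behind a temporary copy so the live
             * tables are never half-updated. */
            memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
            dsb(ishst);     /* make the memcpy visible to the table walker */
            cpu_replace_ttbr1(tmp_pg_dir);

            clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
            /* ... rebuild the permanent shadow in swapper_pg_dir ... */

            cpu_replace_ttbr1(swapper_pg_dir);
    }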
2016-09-18 UPSTREAM: arm64: mm: add code to safely replace TTBR1_EL1 (Mark Rutland)
If page tables are modified without suitable TLB maintenance, the ARM architecture permits multiple TLB entries to be allocated for the same VA. When this occurs, it is permitted that TLB conflict aborts are raised in response to synchronous data/instruction accesses, and/or an amalgamation of the TLB entries may be used as a result of a TLB lookup. The presence of conflicting TLB entries may result in a variety of behaviours detrimental to the system (e.g. erroneous physical addresses may be used by I-cache fetches and/or page table walks). Some of these cases may result in unexpected changes of hardware state, and/or result in the (asynchronous) delivery of SError. To avoid these issues, we must avoid situations where conflicting entries may be allocated into TLBs. For user and module mappings we can follow a strict break-before-make approach, but this cannot work for modifications to the swapper page tables that cover the kernel text and data. Instead, this patch adds code which is intended to be executed from the idmap, which can safely unmap the swapper page tables as it only requires the idmap to be active. This enables us to uninstall the active TTBR1_EL1 entry, invalidate TLBs, then install a new TTBR1_EL1 entry without potentially unmapping code or data required for the sequence. This avoids the risk of conflict, but requires that updates are staged in a copy of the swapper page tables prior to being installed. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 50e1881ddde2a986c7d0d2150985239e5e3d7d96) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Ibd0a7c802a1abe0ba08a99819fd62ac646186005
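The C-side helper, as in the upstream commit; the trampoline it calls (idmap_cpu_replace_ttbr1) lives in assembly inside the idmap:

    static inline void cpu_replace_ttbr1(pgd_t *pgd)
    {
            typedef void (ttbr_replace_func)(phys_addr_t);
            extern ttbr_replace_func idmap_cpu_replace_ttbr1;
            ttbr_replace_func *replace_phys;

            phys_addr_t pgd_phys = virt_to_phys(pgd);

            /* The trampoline must be called by physical address, as it runs
             * with the MMU pointed at the idmap only. */
            replace_phys = (void *)virt_to_phys(idmap_cpu_replace_ttbr1);

            cpu_install_idmap();
            replace_phys(pgd_phys);
            cpu_uninstall_idmap();
    }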
2016-09-18 UPSTREAM: arm64: add function to install the idmap (Mark Rutland)
In some cases (e.g. when making invasive changes to the kernel page tables) we will need to execute code from the idmap. Add a new helper which may be used to install the idmap, complementing the existing cpu_uninstall_idmap. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 609116d202a8c5fd3fe393eb85373cbee906df68) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I1aac9e85e3d49add72c8aab7d80777a39f5fdd8e
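The helper mirrors cpu_uninstall_idmap; as in the upstream commit:

    static inline void cpu_install_idmap(void)
    {
            cpu_set_reserved_ttbr0();
            local_flush_tlb_all();
            cpu_set_idmap_tcr_t0sz();

            cpu_switch_mm(idmap_pg_dir, &init_mm);
    }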
2016-09-18 UPSTREAM: arm64: unmap idmap earlier (Mark Rutland)
During boot we leave the idmap in place until paging_init, as we previously had to wait for the zero page to become allocated and accessible. Now that we have a statically-allocated zero page, we can uninstall the idmap much earlier in the boot process, making it far easier to spot accidental use of physical addresses. This also brings the cold boot path in line with the secondary boot path. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 86ccce896cb0aa800a7a6dcd29b41ffc4eeb1a75) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I6375bd9855e45727790697875b7cd19f84a4dd7f
2016-09-18 UPSTREAM: arm64: unify idmap removal (Mark Rutland)
We currently open-code the removal of the idmap and restoration of the current task's MMU state in a few places. Before introducing yet more copies of this sequence, unify these to call a new helper, cpu_uninstall_idmap. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 9e8e865bbe294a69666a1996bda3e87825b258c0) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I6e9cb0253a1d2d63232f8fa0b3f39f8f6987b239
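The unified helper, as in the upstream commit:

    static inline void cpu_uninstall_idmap(void)
    {
            struct mm_struct *mm = current->active_mm;

            cpu_set_reserved_ttbr0();
            local_flush_tlb_all();
            cpu_set_default_tcr_t0sz();

            if (mm != &init_mm)
                    cpu_switch_mm(mm->pgd, mm);
    }

Pointing TTBR0 at the reserved (zero) page before the TLB flush guarantees no new entries can be allocated for the old tables during the switch.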
2016-09-18 UPSTREAM: arm64: mm: place empty_zero_page in bss (Mark Rutland)
Currently the zero page is set up in paging_init, and thus we cannot use the zero page earlier. We use the zero page as a reserved TTBR value from which no TLB entries may be allocated (e.g. when uninstalling the idmap). To enable such usage earlier (as may be required for invasive changes to the kernel page tables), and to minimise the time that the idmap is active, we need to be able to use the zero page before paging_init. This patch follows the example set by x86, by allocating the zero page at compile time, in .bss. This means that the zero page itself is available immediately upon entry to start_kernel (as we zero .bss before this), and also means that the zero page takes up no space in the raw Image binary. The associated struct page is allocated in bootmem_init, and remains unavailable until this time. Outside of arch code, the only users of empty_zero_page assume that the empty_zero_page symbol refers to the zeroed memory itself, and that ZERO_PAGE(x) must be used to acquire the associated struct page, following the example of x86. This patch also brings arm64 in line with these assumptions. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 5227cfa71f9e8574373f4d0e9e754942d76cdf67) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I43005dd8533d9968d2cef48bd2821d4507cae5ad
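A sketch of the resulting definitions (the exact ZERO_PAGE conversion helper may differ in the patch):

    /* arch/arm64/mm/mmu.c: was a page allocated in paging_init() */
    unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
            __page_aligned_bss;

    /* asm/pgtable.h: callers obtain the struct page via ZERO_PAGE() */
    #define ZERO_PAGE(vaddr)        virt_to_page(empty_zero_page)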
2016-09-18 UPSTREAM: arm64: mm: specialise pagetable allocators (Mark Rutland)
We pass a size parameter to early_alloc and late_alloc, but these are only ever used to allocate single pages. In late_alloc we always allocate a single page. Both allocators provide us with zeroed pages (such that all entries are invalid), but we have no barriers between allocating a page and adding that page to existing (live) tables. A concurrent page table walk may see stale data, leading to a number of issues. This patch specialises the two allocators for page tables. The size parameter is removed and the necessary dsb(ishst) is folded into each. To make it clear that the functions are intended for use for page table allocation, they are renamed to {early,late}_pgtable_alloc, with the related function pointer renamed to pgtable_alloc. As the dsb(ishst) is now in the allocator, the existing barrier for the zero page is redundant and thus is removed. The previously missing include of barrier.h is added. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 21ab99c289d350f4ae454bc069870009db6df20e) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Ib88fafc146506943122aa1ca6ee5a6c331ddb26c
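The late allocator after the change looks roughly like this (the early memblock-based one is analogous); PGALLOC_GFP includes __GFP_ZERO, so the page arrives zeroed:

    static phys_addr_t late_pgtable_alloc(void)
    {
            void *ptr = (void *)__get_free_page(PGALLOC_GFP);
            BUG_ON(!ptr);

            /* Ensure the zeroed (invalid) entries are visible to the page
             * table walker before this page can be linked into live tables. */
            dsb(ishst);

            return __pa(ptr);
    }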
2016-09-18 UPSTREAM: asm-generic: Fix local variable shadow in __set_fixmap_offset (Mark Rutland)
Currently __set_fixmap_offset is a macro function which has a local variable called 'addr'. If a caller passes a 'phys' parameter which is derived from a variable also called 'addr', the local variable will shadow this, and the compiler will complain about the use of an uninitialized variable. To avoid the issue with namespace clashes, 'addr' is prefixed with a liberal sprinkling of underscores. Turning __set_fixmap_offset into a static inline breaks the build for several architectures. Fixing this properly requires updates to a number of architectures to make them agree on the prototype of __set_fixmap (it could be done as a subsequent patch series). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> [catalin.marinas@arm.com: squashed the original function patch and macro fixup] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 3694bd76781b76c4f8d2ecd85018feeb1609f0e5) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Iec27cb36dfca39e333de9f7318e76da0670d0156
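The resulting macro, as in the upstream commit:

    #define __set_fixmap_offset(idx, phys, flags)                         \
    ({                                                                    \
            unsigned long ________addr;                                   \
            __set_fixmap(idx, phys, flags);                               \
            ________addr = fix_to_virt(idx) + ((phys) & (PAGE_SIZE - 1)); \
            ________addr;                                                 \
    })

The underscores make a collision with a caller-side 'addr' (and the resulting -Wshadow/uninitialized-use warning) practically impossible without requiring every architecture to agree on a common __set_fixmap prototype.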
2016-09-18 BACKPORT: Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules (William Cohen)
By default the aarch64 gcc generates .eh_frame sections. Unlike .debug_frame sections, the .eh_frame sections are loaded into memory when the associated code is loaded. On an example kernel being built with this default the .eh_frame section in vmlinux used an extra 1.7MB of memory. The x86 disables the creation of the .eh_frame section. The aarch64 should probably do the same to save some memory. Signed-off-by: William Cohen <wcohen@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 728dabd6d1751cf5e0f8e0535891393da62396e9) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I697baae2209d6d11f2cc447459d935f7200eb7b1
2016-09-18 UPSTREAM: arm64: Fix an enum typo in mm/dump.c (Masanari Iida)
This patch fixes a typo in mm/dump.c: "MODUELS_END_NR" should be "MODULES_END_NR". Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit b3122023df935cf14bf951da98ca598d71b9f826) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I5e73d18fd70f20f200a974f5f2ba22cb7ae64952
2016-09-18 UPSTREAM: arm64: kasan: ensure that the KASAN zero page is mapped read-only (Ard Biesheuvel)
When switching from the early KASAN shadow region, which maps the entire shadow space read-write, to the permanent KASAN shadow region, which uses a zero page to shadow regions that are not subject to instrumentation, the lowest level table kasan_zero_pte[] may be reused unmodified, which means that the mappings of the zero page that it contains will still be read-write. So update it explicitly to map the zero page read only when we activate the permanent mapping. Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 7b1af9795773d745c2a8c7d4ca5f2936e8b6adfb) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I4745dc91236c2612ad3e13be1cc176ffd923f7da
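The fix, as it appears in kasan_init() just before the permanent tables are installed:

    /* kasan_zero_pte is reused from the early shadow; remap the zero page
     * read-only before activating the permanent mapping. */
    for (i = 0; i < PTRS_PER_PTE; i++)
            set_pte(&kasan_zero_pte[i],
                    pfn_pte(virt_to_pfn(kasan_zero_page), PAGE_KERNEL_RO));

    memset(kasan_zero_page, 0, PAGE_SIZE);
    cpu_replace_ttbr1(swapper_pg_dir);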
2016-09-18 UPSTREAM: arch/arm/include/asm/pgtable-3level.h: add pmd_mkclean for THP (Minchan Kim)
MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite of the contents since MADV_FREE syscall is called for THP page. This patch adds pmd_mkclean for THP page MADV_FREE support. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Hugh Dickins <hughd@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Shaohua Li <shli@kernel.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 44842045e4baaf406db2954dd2e07152fa61528d) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I59d53667aa8c40dea4f18fc58acc7d27f4a85a04
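The addition is one line alongside the existing pmd bit helpers in pgtable-3level.h:

    /* Existing pattern: PMD_BIT_FUNC(fn, op) expands to a pmd_##fn()
     * accessor that applies 'op' to pmd_val(pmd). */
    PMD_BIT_FUNC(mkdirty,   |= L_PMD_SECT_DIRTY);
    PMD_BIT_FUNC(mkclean,   &= ~L_PMD_SECT_DIRTY);  /* new for MADV_FREE */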
2016-09-18 UPSTREAM: arm64: hide __efistub_ aliases from kallsyms (Ard Biesheuvel)
Commit e8f3010f7326 ("arm64/efi: isolate EFI stub from the kernel proper") isolated the EFI stub code from the kernel proper by prefixing all of its symbols with __efistub_, and selectively allowing access to core kernel symbols from the stub by emitting __efistub_ aliases for functions and variables that the stub can access legally. As an unintended side effect, these aliases are emitted into the kallsyms symbol table, which means they may turn up in backtraces, e.g.,

  ...
  PC is at __efistub_memset+0x108/0x200
  LR is at fixup_init+0x3c/0x48
  ...
  [<ffffff8008328608>] __efistub_memset+0x108/0x200
  [<ffffff8008094dcc>] free_initmem+0x2c/0x40
  [<ffffff8008645198>] kernel_init+0x20/0xe0
  [<ffffff8008085cd0>] ret_from_fork+0x10/0x40

The backtrace in question has nothing to do with the EFI stub, but simply returns one of the several aliases of memset() that have been recorded in the kallsyms table. This is undesirable, since it may suggest to people who are not aware of this that the issue they are seeing is somehow EFI related. So hide the __efistub_ aliases from kallsyms, by emitting them as absolute linker symbols explicitly. The distinction between those and section relative symbols is completely irrelevant to these definitions, and to the final link we are performing when these definitions are being taken into account (the distinction is only relevant to symbols defined inside a section definition when performing a partial link), and so the resulting values are identical to the original ones. Since absolute symbols are ignored by kallsyms, this will result in these values to be omitted from its symbol table. After this patch, the backtrace generated from the same address looks like this:

  ...
  PC is at __memset+0x108/0x200
  LR is at fixup_init+0x3c/0x48
  ...
  [<ffffff8008328608>] __memset+0x108/0x200
  [<ffffff8008094dcc>] free_initmem+0x2c/0x40
  [<ffffff8008645198>] kernel_init+0x20/0xe0
  [<ffffff8008085cd0>] ret_from_fork+0x10/0x40

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 75feee3d9d51775072d3a04f47d4a439a4c4590e) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Ieeb8c576e31f0dd0b7f82982ffa6362864eed311
2016-09-18 UPSTREAM: arm64: head.S: use memset to clear BSS (Mark Rutland)
Currently we use an open-coded memzero to clear the BSS. As it is a trivial implementation, it is sub-optimal. Our optimised memset doesn't use the stack, is position-independent, and for the memzero case can make use of DC ZVA to clear large blocks efficiently. In __mmap_switched the MMU is on and there are no live caller-saved registers, so we can safely call an uninstrumented memset. This patch changes __mmap_switched to use memset when clearing the BSS. We use the __pi_memset alias so as to avoid any instrumentation in all kernel configurations. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 2a803c4db615d85126c5c7afd5849a3cfde71422) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I3dc7050fe5566f2126cbea9abfa6063c8e6b029a
2016-09-18 UPSTREAM: efi: stub: define DISABLE_BRANCH_PROFILING for all architectures (Ard Biesheuvel)
This moves the DISABLE_BRANCH_PROFILING define from the x86 specific to the general CFLAGS definition for the stub. This fixes build errors when building for arm64 with CONFIG_PROFILE_ALL_BRANCHES_ENABLED. Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit b523e185bba36164ca48a190f5468c140d815414) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Id5a2a3986adc60ddf3d247c038667ce9719f3ee0
2016-09-18 UPSTREAM: arm64: entry: remove pointless SPSR mode check (Mark Rutland)
In work_pending, we may skip work if the stacked SPSR value represents anything other than an EL0 context. We then immediately invoke the kernel_exit 0 macro as part of ret_to_user, assuming a return to EL0. This is somewhat confusing. We use work_pending as part of the ret_to_user/ret_fast_syscall state machine. We only use ret_fast_syscall in the return from an SVC issued from EL0. We use ret_to_user for return from EL0 exception handlers and also for return from ret_from_fork in the case the task was not a kernel thread (i.e. it is a user task). Thus in all cases the stacked SPSR value must represent an EL0 context, and the check is redundant. This patch removes it, along with the now unused no_work_pending label. Cc: Chris Metcalf <cmetcalf@ezchip.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit ee03353bc04f8e460cc4e3da80d9721d9ecb89f1) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I2738149342ef469e8cde7c5f8f7c65daab93fb2b
2016-09-18 UPSTREAM: arm64: mm: move pgd_cache initialisation to pgtable_cache_init (Will Deacon)
Initialising the support for EFI runtime services requires us to allocate a pgd off the back of an early_initcall. On systems where the PGD_SIZE is smaller than PAGE_SIZE (e.g. 64k pages and 48-bit VA), the pgd_cache isn't initialised at this stage, and we panic with a NULL dereference during boot:

  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  __create_mapping.isra.5+0x84/0x350
  create_pgd_mapping+0x20/0x28
  efi_create_mapping+0x5c/0x6c
  arm_enable_runtime_services+0x154/0x1e4
  do_one_initcall+0x8c/0x190
  kernel_init_freeable+0x84/0x1ec
  kernel_init+0x10/0xe0
  ret_from_fork+0x10/0x50

This patch fixes the problem by initialising the pgd_cache earlier, in the pgtable_cache_init callback, which sounds suspiciously like what it was intended for. Reported-by: Dennis Chen <dennis.chen@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 39b5be9b4233a9f212b98242bddf008f379b5122) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I124f03d19299be93124af35641294bf73c13bb22
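A sketch of the moved initialisation (the hook-up macro name is an assumption based on the commit title):

    /* arch/arm64/mm/pgd.c: was a core_initcall, too late for EFI */
    void __init pgd_cache_init(void)
    {
            if (PGD_SIZE == PAGE_SIZE)
                    return;

            /* Naturally aligned pgds required by the architecture. */
            pgd_cache = kmem_cache_create("pgd_cache", PGD_SIZE, PGD_SIZE,
                                          SLAB_PANIC, NULL);
    }

    /* asm/pgtable.h: runs from start_kernel(), before any early_initcall */
    #define pgtable_cache_init()    pgd_cache_init()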
2016-09-18 UPSTREAM: arm64: traps: address fallout from printk -> pr_* conversion (Will Deacon)
Commit ac7b406c1a9d ("arm64: Use pr_* instead of printk") was a fairly mindless s/printk/pr_*/ change driven by a complaint from checkpatch. As is usual with such changes, this has led to some odd behaviour on arm64:

  * syslog now picks up the "pr_emerg" line from dump_backtrace, but not the actual trace, which leads to a bunch of "kernel:Call trace:" lines in the log
  * __{pte,pmd,pgd}_error print at KERN_CRIT, as opposed to KERN_ERR which is used by other architectures.

This patch restores the original printk behaviour for dump_backtrace and downgrades the pgtable error macros to KERN_ERR. Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit c9cd0ed925c0b927283d4739bfe689eb9d1e9dfd) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Iaf028e5368df4623f7257ef432a7f9da86261609
2016-09-18 UPSTREAM: arm64: ftrace: fix a stack tracer's output under function graph tracer (AKASHI Takahiro)
Function graph tracer modifies a return address (LR) in a stack frame to hook a function return. This will result in many useless entries (return_to_handler) showing up in
  a) a stack tracer's output
  b) perf call graph (with perf record -g)
  c) dump_backtrace (at panic et al.)

For example, in case of a),

  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
  $ echo 1 > /proc/sys/kernel/stack_trace_enabled
  $ cat /sys/kernel/debug/tracing/stack_trace
        Depth    Size   Location    (54 entries)
        -----    ----   --------
    0)     4504      16   gic_raise_softirq+0x28/0x150
    1)     4488      80   smp_cross_call+0x38/0xb8
    2)     4408      48   return_to_handler+0x0/0x40
    3)     4360      32   return_to_handler+0x0/0x40
    ...

In case of b),

  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
  $ perf record -e mem:XXX:x -ag -- sleep 10
  $ perf report
    ...
    |          |--0.22%-- 0x550f8
    |          |          0x10888
    |          |          el0_svc_naked
    |          |          sys_openat
    |          |          return_to_handler
    |          |          return_to_handler
    ...

In case of c),

  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
  $ echo c > /proc/sysrq-trigger
  ...
  Call trace:
  [<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
  [<ffffffc000092250>] return_to_handler+0x0/0x40
  [<ffffffc000092250>] return_to_handler+0x0/0x40
  ...

This patch replaces such entries with real addresses preserved in current->ret_stack[] at unwind_frame(). This way, we can cover all the cases. Reviewed-by: Jungseok Lee <jungseoklee85@gmail.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> [will: fixed minor context changes conflicting with irq stack bits] Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 20380bb390a443b2c5c8800cec59743faf8151b4) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I6360182f8d04fdd2e31c0cb6054aefa2adb216e7
2016-09-18 UPSTREAM: arm64: pass a task parameter to unwind_frame() (AKASHI Takahiro)
Function graph tracer modifies a return address (LR) in a stack frame to hook a function's return. This will result in many useless entries (return_to_handler) showing up in a call stack list. We will fix this problem in a later patch ("arm64: ftrace: fix a stack tracer's output under function graph tracer"). But since real return addresses are saved in ret_stack[] array in struct task_struct, unwind functions need to be notified of, in addition to a stack pointer address, which task is being traced in order to find out real return addresses. This patch extends unwind functions' interfaces by adding an extra argument of a pointer to task_struct. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit fe13f95b720075327a761fe6ddb45b0c90cab504) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I92a9a07468c182d5abbacaa73a90984ab11ad535
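The interface change in asm/stacktrace.h, as in the upstream commit:

    /* Before:
     *   extern int unwind_frame(struct stackframe *frame);
     *
     * After: tsk lets the unwinder consult tsk->ret_stack[] for the real
     * return addresses saved by the function graph tracer. */
    extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
    extern void walk_stackframe(struct task_struct *tsk,
                                struct stackframe *frame,
                                int (*fn)(struct stackframe *, void *),
                                void *data);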
2016-09-18 UPSTREAM: arm64: ftrace: modify a stack frame in a safe way (AKASHI Takahiro)
Function graph tracer modifies a return address (LR) in a stack frame by calling prepare_ftrace_return() in a traced function's function prologue. The current code does this modification before preserving the original address at ftrace_push_return_trace(), so there is always a small window of inconsistency when an interrupt occurs. This doesn't matter once an interrupt stack is introduced, because the stack tracer won't be invoked in an interrupt context. But it would be better to proactively minimize such a window by moving the LR modification after ftrace_push_return_trace(). Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 79fdee9b6355c9720f14717e1ad66af51bb331b5) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Ief0a136aa01348b4d0d3f447544f21fd77835c67
2016-09-18 UPSTREAM: arm64: remove irq_count and do_softirq_own_stack() (James Morse)
sysrq_handle_reboot() re-enables interrupts while on the irq stack. The irq_stack implementation wrongly assumed this would only ever happen via the softirq path, allowing it to update irq_count late, in do_softirq_own_stack(). This means if an irq occurs in sysrq_handle_reboot(), during emergency_restart() the stack will be corrupted, as irq_count wasn't updated. Lose the optimisation, and instead of moving the adding/subtracting of irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare sp_el0 (struct thread_info) with sp & ~(THREAD_SIZE - 1). This tells us if we are on a task stack, if so, we can safely switch to the irq stack. Finally, remove do_softirq_own_stack(), we don't need it anymore. Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: James Morse <james.morse@arm.com> [will: use get_thread_info macro] Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit d224a69e3d80fe08f285d1f41d21b590bae4fa9f) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I1f613279bf875443b10d65b1cd1ed4a6abfcb605
2016-09-18 UPSTREAM: arm64: hugetlb: add support for PTE contiguous bit (David Woods)
The arm64 MMU supports a Contiguous bit which is a hint that the TTE is one of a set of contiguous entries which can be cached in a single TLB entry. Supporting this bit adds new intermediate huge page sizes. The set of huge page sizes available depends on the base page size. Without using contiguous pages the huge page sizes are as follows.

  4KB:   2MB  1GB
  64KB:  512MB

With a 4KB granule, the contiguous bit groups together sets of 16 pages and with a 64KB granule it groups sets of 32 pages. This enables two new huge page sizes in each case, so that the full set of available sizes is as follows.

  4KB:   64KB  2MB  32MB  1GB
  64KB:  2MB  512MB  16GB

If a 16KB granule is used then the contiguous bit groups 128 pages at the PTE level and 32 pages at the PMD level. If the base page size is set to 64KB then 2MB pages are enabled by default. It is possible in the future to make 2MB the default huge page size for both 4KB and 64KB granules. Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com> Reviewed-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: David Woods <dwoods@ezchip.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 66b3923a1a0f77a563b43f43f6ad091354abbfe9) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I5e99c5165bc5eb966adf4d4523632fd9eedd9602
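A simplified sketch of the pte-level install path (the real set_huge_pte_at() also derives the group size per level; the XOR trick recovering the prot bits follows the upstream code):

    void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                         pte_t *ptep, pte_t pte)
    {
            int i;
            unsigned long pfn = pte_pfn(pte);
            /* Recover the prot bits by stripping the address bits. */
            pgprot_t hugeprot =
                    __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte));

            if (!pte_cont(pte)) {   /* ordinary block-sized huge page */
                    set_pte_at(mm, addr, ptep, pte);
                    return;
            }

            /* Write CONT_PTES identical entries (16 with a 4K granule) so
             * the MMU may cache the whole group in one TLB entry. */
            for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE, pfn++)
                    set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
    }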
2016-09-18 BACKPORT: arm64: Use PoU cache instr for I/D coherency (Ashok Kumar)
In systems with three levels of cache (PoU at L1 and PoC at L3), PoC cache flush instructions flush the L2 and L3 caches, which could affect performance. For cache flushes for I and D coherency, PoU should suffice. So this changes all I and D coherency related cache flushes to PoU. Introduced a new __clean_dcache_area_pou API for dcache flush to the PoU and provided a common macro for __flush_dcache_area and __clean_dcache_area_pou. Also, now in __sync_icache_dcache, icache invalidation for non-aliasing VIPT icache is done only for that particular page instead of the earlier __flush_icache_all. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ashok Kumar <ashoks@broadcom.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 0a28714c53fd4f7aea709be7577dfbe0095c8c3e) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I64f065140d5e8783e91ed53ae9c7a2e33a3e515a
2016-09-18 BACKPORT: arm64: kernel: fix architected PMU registers unconditional access (Lorenzo Pieralisi)
The Performance Monitors extension is an optional feature of the AArch64 architecture, therefore, in order to access Performance Monitors registers safely, the kernel should detect the architected PMU unit presence through the ID_AA64DFR0_EL1 register PMUVer field before accessing them. This patch implements a guard by reading the ID_AA64DFR0_EL1 register PMUVer field to detect the architected PMU presence and prevent accessing PMU system registers if the Performance Monitors extension is not implemented in the core. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Fixes: 60792ad349f3 ("arm64: kernel: enforce pmuserenr_el0 initialization and restore") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit f436b2ac90a095746beb6729b8ee8ed87c9eaede) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Ie442b9feba5d143bd6b6c82d70190fc8bc95298d
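The guard amounts to the following, rendered here in C for clarity; the real check lives in assembly (the pmuserenr_el0 reset macro), and write_sysreg() is used only for brevity here:

    static void reset_pmuserenr_el0_sketch(void)
    {
            u64 dfr0 = read_cpuid(ID_AA64DFR0_EL1);
            unsigned int pmuver = (dfr0 >> ID_AA64DFR0_PMUVER_SHIFT) & 0xf;

            if (pmuver == 0)        /* no architected PMU: registers may trap */
                    return;

            write_sysreg(0, pmuserenr_el0); /* disable EL0 counter access */
    }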
2016-09-18 UPSTREAM: arm64: Defer dcache flush in __cpu_copy_user_page (Ashok Kumar)
Defer dcache flushing to __sync_icache_dcache by calling flush_dcache_page which clears PG_dcache_clean flag. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ashok Kumar <ashoks@broadcom.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit e6b1185f77351aa154e63bd54b05d07ff99d4ffa) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I2e02ce2f43f68287337bed30e3c3455c0eee4164
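The resulting function, as in the upstream commit:

    void __cpu_copy_user_page(void *kto, const void *kfrom, unsigned long vaddr)
    {
            struct page *page = virt_to_page(kto);

            copy_page(kto, kfrom);
            /* Clears PG_dcache_clean; the actual flush happens lazily in
             * __sync_icache_dcache() only if the page is mapped executable. */
            flush_dcache_page(page);
    }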
2016-09-18 UPSTREAM: arm64: reduce stack use in irq_handler (James Morse)
The code for switching to irq_stack stores three pieces of information on the stack: fp+lr, as a fake stack frame (that lets us walk back onto the interrupted task's stack frame), and the address of the struct pt_regs that contains the register values from kernel entry. (which dump_backtrace() will print in any stack trace). To reduce this, we store fp, and the pointer to the struct pt_regs. unwind_frame() can recognise this as the irq_stack dummy frame, (as it only appears at the top of the irq_stack), and use the struct pt_regs values to find the missing interrupted link-register. Suggested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 971c67ce37cfeeaf560e792a2c3bc21d8b67163a) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I84cbb04857a441083d331e875c3e228d24ec2276
2016-09-18 UPSTREAM: arm64: Documentation: add list of software workarounds for errata (Will Deacon)
It's not immediately obvious which hardware errata are worked around in the Linux kernel for an arbitrary kernel tree, so add a file to keep track of what we're working around. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 9cb9c9e5ba8453537e8e645318edf231fe54eaf9) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I139972ff6e10e12c4b4f27cd047f55b3dd8b4118
2016-09-18 UPSTREAM: arm64: mm: place __cpu_setup in .text (Mark Rutland)
We drop __cpu_setup in .text.init, which ends up being part of .text. The .text.init section was a legacy section name which has been unused elsewhere for a long time. The ".text.init" name is misleading if read as a synonym for ".init.text". Any CPU may execute __cpu_setup before turning the MMU on, so it should simply live in .text. Remove the pointless section assignment. This will leave __cpu_setup in the .text section. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit f00083cae331e5d3eecade6b4fdc35d0825e73ef) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I2e9b154cd6de92662c70c2b957479448252c661a
2016-09-18 UPSTREAM: arm64: cmpxchg: Don't include linux/mmdebug.h (Mark Brown)
The arm64 asm/cmpxchg.h includes linux/mmdebug.h but doesn't so far as I can tell actually use anything from it. Removing the inclusion reduces spurious header dependency rebuilds and also avoids issues with recursive inclusions of headers causing build breaks due to attempts to use things before they are defined if linux/mmdebug.h starts pulling in more low level headers. Such errors have happened in -next recently, for example:

  In file included from include/linux/completion.h:11:0,
                   from include/linux/rcupdate.h:43,
                   from include/linux/tracepoint.h:19,
                   from include/linux/mmdebug.h:6,
                   from ./arch/arm64/include/asm/cmpxchg.h:22,
                   from ./arch/arm64/include/asm/atomic.h:41,
                   from include/linux/atomic.h:4,
                   from include/linux/spinlock.h:406,
                   from include/linux/seqlock.h:35,
                   from include/linux/time.h:5,
                   from include/uapi/linux/timex.h:56,
                   from include/linux/timex.h:56,
                   from include/linux/sched.h:19,
                   from arch/arm64/kernel/asm-offsets.c:21:
  include/linux/wait.h: In function 'wait_on_atomic_t':
  include/linux/wait.h:1218:2: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]
    if (atomic_read(val) == 0)

Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 4a6ccf30263f4e265c0f171561bf4c40bed5f273) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Idf44176dad0abc11e67b4e416b02a3fba14f3f1b
2016-09-18 UPSTREAM: arm64: mm: fold alternatives into .init (Mark Rutland)
Currently we treat the alternatives separately from other data that's only used during initialisation, using separate .altinstructions and .altinstr_replacement linker sections. These are freed for general allocation separately from .init*. This is problematic as: * We do not remove execute permissions, as we do for .init, leaving the memory executable. * We pad between them, making the kernel Image binary up to PAGE_SIZE bytes larger than necessary. This patch moves the two sections into the contiguous region used for .init*. This saves some memory, ensures that we remove execute permissions, and allows us to remove some code made redundant by this reorganisation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 9aa4ec1571da62366cfddc20f3b923609604fe63) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I3fee118ead5c73ade50ea10d436881c9424a549c
2016-09-18 BACKPORT: arm64: Remove redundant padding from linker script (Mark Rutland)
Currently we place an ALIGN_DEBUG_RO between text and data for the .text and .init sections, and depending on configuration each of these may result in up to SECTION_SIZE bytes worth of padding (for DEBUG_ALIGN_RODATA). We make no distinction between the text and data in each of these sections at any point when creating the initial page tables in head.S. We also make no distinction when modifying the tables; __map_memblock, fixup_executable, mark_rodata_ro, and fixup_init only work at section granularity. Thus this padding is unnecessary. For the split between init text and data we impose a minimum alignment of 16 bytes, but this is also unnecessary. The init data is output immediately after the padding before any symbols are defined, so this is not required to keep a symbol for a linker section array correctly associated with the data. Any objects within the section will be given at least their usual alignment regardless. This patch removes the redundant padding. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit 5b28cd9d084eca8ddc46270d2720305bfd40e348) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I5540ba1f4d90e2ae8fafa22e98c389bc4e975ac7
2016-09-18 UPSTREAM: arm64: mm: remove pointless PAGE_MASKing (Mark Rutland)
As pgd_offset{,_k} shift the input address by PGDIR_SHIFT, the sub-page bits will always be shifted out. There is no need to apply PAGE_MASK before this. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Bug: 30369029 Patchset: rework-pagetable (cherry picked from commit e2c30ee320eb96304896c7ab84499e5bc5e5fb6e) Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: I86d3438aecf7295d5895e1430c1e19fbc82c9719
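A sketch of the simplification at the call sites:

    pgd_t *pgd;

    pgd = pgd_offset_k(virt & PAGE_MASK);   /* before */
    pgd = pgd_offset_k(virt);               /* after: pgd_index() shifts by
                                             * PGDIR_SHIFT, so the sub-page
                                             * bits are discarded anyway */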
2016-09-19 net: inet: diag: expose the socket mark to privileged processes. (Lorenzo Colitti)
This adds the capability for a process that has CAP_NET_ADMIN on a socket to see the socket mark in socket dumps. Commit a52e95abf772 ("net: diag: allow socket bytecode filters to match socket marks") recently gave privileged processes the ability to filter socket dumps based on mark. This patch is complementary: it ensures that the mark is also passed to userspace in the socket's netlink attributes. It is useful for tools like ss which display information about sockets. [backport of net-next d545caca827b65aab557a9e9dcdcf1e5a3823c2d] Change-Id: I33336ed9c3ee3fb78fe05c4c47b7fd18c6e33ef1 Tested: https://android-review.googlesource.com/270210 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
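A sketch of the added attribute in inet_sk_diag_fill(); the mark is emitted only for requesters holding CAP_NET_ADMIN:

    bool net_admin = netlink_net_capable(cb->skb, CAP_NET_ADMIN);

    /* ... existing INET_DIAG_* attributes ... */

    if (net_admin &&
        nla_put_u32(skb, INET_DIAG_MARK, sk->sk_mark))
            goto errout;

Tools such as ss can then read INET_DIAG_MARK from the netlink response when run with sufficient privilege.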