summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'akpm' (more incoming from Andrew)Linus Torvalds2013-02-2423-44/+673
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge second patch-bomb from Andrew Morton: - A little DM fix - the MM queue * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (154 commits) ksm: allocate roots when needed mm: cleanup "swapcache" in do_swap_page mm,ksm: swapoff might need to copy mm,ksm: FOLL_MIGRATION do migration_entry_wait ksm: shrink 32-bit rmap_item back to 32 bytes ksm: treat unstable nid like in stable tree ksm: add some comments tmpfs: fix mempolicy object leaks tmpfs: fix use-after-free of mempolicy object mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages mm: export mmu notifier invalidates mm: accelerate mm_populate() treatment of THP pages mm: use long type for page counts in mm_populate() and get_user_pages() mm: accurately document nr_free_*_pages functions with code comments HWPOISON: change order of error_states[]'s elements HWPOISON: fix misjudgement of page_action() for errors on mlocked pages memcg: stop warning on memcg_propagate_kmem net: change type of virtio_chan->p9_max_pages vmscan: change type of vm_total_pages to unsigned long fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used ...
| * ia64: use %ld to print pages calculated in nr_free_buffer_pagesZhang Yanfei2013-02-242-2/+2
| | | | | | | | | | | | | | | | | | Now the function nr_free_buffer_pages returns unsigned long, so use %ld to print its return value. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * swap: add per-partition lock for swapfileShaohua Li2013-02-242-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | swap_lock is heavily contended when I test swap to 3 fast SSD (even slightly slower than swap to 2 such SSD). The main contention comes from swap_info_get(). This patch tries to fix the gap with adding a new per-partition lock. Global data like nr_swapfiles, total_swap_pages, least_priority and swap_list are still protected by swap_lock. nr_swap_pages is an atomic now, it can be changed without swap_lock. In theory, it's possible get_swap_page() finds no swap pages but actually there are free swap pages. But sounds not a big problem. Accessing partition specific data (like scan_swap_map and so on) is only protected by swap_info_struct.lock. Changing swap_info_struct.flags need hold swap_lock and swap_info_struct.lock, because scan_scan_map() will check it. read the flags is ok with either the locks hold. If both swap_lock and swap_info_struct.lock must be hold, we always hold the former first to avoid deadlock. swap_entry_free() can change swap_list. To delete that code, we add a new highest_priority_index. Whenever get_swap_page() is called, we check it. If it's valid, we use it. It's a pity get_swap_page() still holds swap_lock(). But in practice, swap_lock() isn't heavily contended in my test with this patch (or I can say there are other much more heavier bottlenecks like TLB flush). And BTW, looks get_swap_page() doesn't really need the lock. We never free swap_info[] and we check SWAP_WRITEOK flag. The only risk without the lock is we could swapout to some low priority swap, but we can quickly recover after several rounds of swap, so sounds not a big deal to me. But I'd prefer to fix this if it's a real problem. "swap: make each swap partition have one address_space" improved the swapout speed from 1.7G/s to 2G/s. This patch further improves the speed to 2.3G/s, so around 15% improvement. It's a multi-process test, so TLB flush isn't the biggest bottleneck before the patches. [arnd@arndb.de: fix it for nommu] [hughd@google.com: add missing unlock] [minchan@kernel.org: get rid of lockdep whinge on sys_swapon] Signed-off-by: Shaohua Li <shli@fusionio.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * acpi, memory-hotplug: support getting hotplug info from SRATTang Chen2013-02-241-5/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT: |_____| |_____| |_________| |_________| ...... * node id: 0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_____| |_________| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. If all the memory ranges in SRAT is hotpluggable, then no memory can be used by kernel. But before parsing SRAT, memblock has already reserve some memory ranges for other purposes, such as for kernel image, and so on. We cannot prevent kernel from using these memory. So we need to exclude these ranges even if these memory is hotpluggable. Furthermore, there could be several memory ranges in the single node which the kernel resides in. We may skip one range that have memory reserved by memblock, but if the rest of memory is too small, then the kernel will fail to boot. So, make the whole node which the kernel resides in un-hotpluggable. Then the kernel has enough memory to use. NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. [akpm@linux-foundation.org: fix warning] [akpm@linux-foundation.org: use strcmp()] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * acpi, memory-hotplug: extend movablemem_map ranges to the end of nodeTang Chen2013-02-241-3/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When implementing movablemem_map boot option, we introduced an array movablemem_map.map[] to store the memory ranges to be set as ZONE_MOVABLE. Since ZONE_MOVABLE is the latst zone of a node, if user didn't specify the whole node memory range, we need to extend it to the node end so that we can use it to prevent memblock from allocating memory in the ranges user didn't specify. We now implement movablemem_map boot option like this: /* * For movablemem_map=nn[KMG]@ss[KMG]: * * SRAT: |_____| |_____| |_________| |_________| ...... * node id: 0 1 1 2 * user specified: |__| |___| * movablemem_map: |___| |_________| |______| ...... * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. * * NOTE: In this case, SRAT info will be ingored. */ [akpm@linux-foundation.org: clean up code, fix build warning] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * acpi, memory-hotplug: parse SRAT before memblock is readyTang Chen2013-02-242-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On linux, the pages used by kernel could not be migrated. As a result, if a memory range is used by kernel, it cannot be hot-removed. So if we want to hot-remove memory, we should prevent kernel from using it. The way now used to prevent this is specify a memory range by movablemem_map boot option and set it as ZONE_MOVABLE. But when the system is booting, memblock will allocate memory, and reserve the memory for kernel. And before we parse SRAT, and know the node memory ranges, memblock is working. And it may allocate memory in ranges to be set as ZONE_MOVABLE. This memory can be used by kernel, and never be freed. So, let's parse SRAT before memblock is called first. And it is early enough. The first call of memblock_find_in_range_node() is in: setup_arch() |-->setup_real_mode() so, this patch add a function early_parse_srat() to parse SRAT, and call it before setup_real_mode() is called. NOTE: 1) early_parse_srat() is called before numa_init(), and has initialized numa_meminfo. So DO NOT clear numa_nodes_parsed in numa_init() and DO NOT zero numa_meminfo in numa_init(), otherwise we will lose memory numa info. 2) I don't know why using count of memory affinities parsed from SRAT as a return value in original acpi_numa_init(). So I add a static variable srat_mem_cnt to remember this count and use it as the return value of the new acpi_numa_init() [mhocko@suse.cz: parse SRAT before memblock is ready fix] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Wen Congyang <wency@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * x86: get pg_data_t's memory from other nodeYasuaki Ishimatsu2013-02-241-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the implementation of SRAT support, we met a problem. In setup_arch(), we have the following call series: 1) memblock is ready; 2) some functions use memblock to allocate memory; 3) parse ACPI tables, such as SRAT. Before 3), we don't know which memory is hotpluggable, and as a result, we cannot prevent memblock from allocating hotpluggable memory. So, in 2), there could be some hotpluggable memory allocated by memblock. Now, we are trying to parse SRAT earlier, before memblock is ready. But I think we need more investigation on this topic. So in this v5, I dropped all the SRAT support, and v5 is just the same as v3, and it is based on 3.8-rc3. As we planned, we will support getting info from SRAT without users' participation at last. And we will post another patch-set to do so. And also, I think for now, we can add this boot option as the first step of supporting movable node. Since Linux cannot migrate the direct mapped pages, the only way for now is to limit the whole node containing only movable memory. Using SRAT is one way. But even if we can use SRAT, users still need an interface to enable/disable this functionality if they don't want to loose their NUMA performance. So I think, a user interface is always needed. For now, users can disable this functionality by not specifying the boot option. Later, we will post SRAT support, and add another option value "movablecore_map=acpi" to using SRAT. This patch: If system can create movable node which all memory of the node is allocated as ZONE_MOVABLE, setup_node_data() cannot allocate memory for the node's pg_data_t. So, use memblock_alloc_try_nid() instead of memblock_alloc_nid() to retry when the first allocation fails. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * cpu-hotplug,memory-hotplug: clear cpu_to_node() when offlining the nodeWen Congyang2013-02-242-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the node is offlined, there is no memory/cpu on the node. If a sleep task runs on a cpu of this node, it will be migrated to the cpu on the other node. So we can clear cpu-to-node mapping. [akpm@linux-foundation.org: numa_clear_node() and numa_set_node() can no longer be __cpuinit] Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * cpu_hotplug: clear apicid to node when the cpu is hotremovedWen Congyang2013-02-242-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a cpu is hotpluged, we call acpi_map_cpu2node() in _acpi_map_lsapic() to store the cpu's node and apicid's node. But we don't clear the cpu's node in acpi_unmap_lsapic() when this cpu is hotremoved. If the node is also hotremoved, we will get the following messages: kernel BUG at include/linux/gfp.h:329! invalid opcode: 0000 [#1] SMP Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel microcode pcspkr i2c_i801 i2c_core lpc_ich mfd_core ioatdma e1000e i7core_edac edac_core sg acpi_memhotplug igb dca sd_mod crc_t10dif megaraid_sas mptsas mptscsih mptbase scsi_transport_sas scsi_mod Pid: 3126, comm: init Not tainted 3.6.0-rc3-tangchen-hostbridge+ #13 FUJITSU-SV PRIMEQUEST 1800E/SB RIP: 0010:[<ffffffff811bc3fd>] [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300 RSP: 0018:ffff88078a049cf8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000246 RBP: ffff88078a049d38 R08: 00000000000040d0 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000b5f R12: 00000000000052d0 R13: ffff8807c1417300 R14: 0000000000030038 R15: 0000000000000003 FS: 00007fa9b1b44700(0000) GS:ffff8807c3800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007fa9b09acca0 CR3: 000000078b855000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process init (pid: 3126, threadinfo ffff88078a048000, task ffff8807bb6f2650) Call Trace: new_slab+0x30/0x1b0 __slab_alloc+0x358/0x4c0 kmem_cache_alloc_node_trace+0xb4/0x1e0 alloc_fair_sched_group+0xd0/0x1b0 sched_create_group+0x3e/0x110 sched_autogroup_create_attach+0x4d/0x180 sys_setsid+0xd4/0xf0 system_call_fastpath+0x16/0x1b Code: 89 c4 e9 73 fe ff ff 31 c0 89 de 48 c7 c7 45 de 9e 81 44 89 45 c8 e8 22 05 4b 00 85 db 44 8b 45 c8 0f 89 4f ff ff ff 0f 0b eb fe <0f> 0b 90 eb fd 0f 0b eb fe 89 de 48 c7 c7 45 de 9e 81 31 c0 44 RIP [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300 RSP <ffff88078a049cf8> ---[ end trace adf84c90f3fea3e5 ]--- The reason is that the cpu's node is not NUMA_NO_NODE, we will call alloc_pages_exact_node() to alloc memory on the node, but the node is offlined. If the node is onlined, we still need cpu's node. For example: a task on the cpu is sleeped when the cpu is hotremoved. We will choose another cpu to run this task when it is waked up. If we know the cpu's node, we will choose the cpu on the same node first. So we should clear cpu-to-node mapping when the node is offlined. This patch only clears apicid-to-node mapping when the cpu is hotremoved. [akpm@linux-foundation.org: fix section error] Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * memory-hotplug: remove memmap of sparse-vmemmapTang Chen2013-02-246-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new API vmemmap_free() to free and remove vmemmap pagetables. Since pagetable implements are different, each architecture has to provide its own version of vmemmap_free(), just like vmemmap_populate(). Note: vmemmap_free() is not implemented for ia64, ppc, s390, and sparc. [mhocko@suse.cz: fix implicit declaration of remove_pagetable] Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * memory-hotplug: remove page table of x86_64 architectureTang Chen2013-02-241-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Search a page table about the removed memory, and clear page table for x86_64 architecture. [akpm@linux-foundation.org: make kernel_physical_mapping_remove() static] Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * memory-hotplug: common APIs to support page tables hot-removeWen Congyang2013-02-243-22/+330
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When memory is removed, the corresponding pagetables should alse be removed. This patch introduces some common APIs to support vmemmap pagetable and x86_64 architecture direct mapping pagetable removing. All pages of virtual mapping in removed memory cannot be freed if some pages used as PGD/PUD include not only removed memory but also other memory. So this patch uses the following way to check whether a page can be freed or not. 1) When removing memory, the page structs of the removed memory are filled with 0FD. 2) All page structs are filled with 0xFD on PT/PMD, PT/PMD can be cleared. In this case, the page used as PT/PMD can be freed. For direct mapping pages, update direct_pages_count[level] when we freed their pagetables. And do not free the pages again because they were freed when offlining. For vmemmap pages, free the pages and their pagetables. For larger pages, do not split them into smaller ones because there is no way to know if the larger page has been split. As a result, there is no way to decide when to split. We deal the larger pages in the following way: 1) For direct mapped pages, all the pages were freed when they were offlined. And since menmory offline is done section by section, all the memory ranges being removed are aligned to PAGE_SIZE. So only need to deal with unaligned pages when freeing vmemmap pages. 2) For vmemmap pages being used to store page_struct, if part of the larger page is still in use, just fill the unused part with 0xFD. And when the whole page is fulfilled with 0xFD, then free the larger page. [akpm@linux-foundation.org: fix typo in comment] [tangchen@cn.fujitsu.com: do not calculate direct mapping pages when freeing vmemmap pagetables] [tangchen@cn.fujitsu.com: do not free direct mapping pages twice] [tangchen@cn.fujitsu.com: do not free page split from hugepage one by one] [tangchen@cn.fujitsu.com: do not split pages when freeing pagetable pages] [akpm@linux-foundation.org: use pmd_page_vaddr()] [akpm@linux-foundation.org: fix used-uninitialised bug] Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * memory-hotplug: implement register_page_bootmem_info_section of sparse-vmemmapYasuaki Ishimatsu2013-02-244-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For removing memmap region of sparse-vmemmap which is allocated bootmem, memmap region of sparse-vmemmap needs to be registered by get_page_bootmem(). So the patch searches pages of virtual mapping and registers the pages by get_page_bootmem(). NOTE: register_page_bootmem_memmap() is not implemented for ia64, ppc, s390, and sparc. So introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when platform doesn't support it. It's implemented by adding a new Kconfig option named CONFIG_HAVE_BOOTMEM_INFO_NODE, which will be automatically selected by memory-hotplug feature fully supported archs(currently only on x86_64). Since we have 2 config options called MEMORY_HOTPLUG and MEMORY_HOTREMOVE used for memory hot-add and hot-remove separately, and codes in function register_page_bootmem_info_node() are only used for collecting infomation for hot-remove, so reside it under MEMORY_HOTREMOVE. Besides page_isolation.c selected by MEMORY_ISOLATION under MEMORY_HOTPLUG is also such case, move it too. [mhocko@suse.cz: put register_page_bootmem_memmap inside CONFIG_MEMORY_HOTPLUG_SPARSE] [linfeng@cn.fujitsu.com: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node()] [mhocko@suse.cz: remove the arch specific functions without any implementation] [linfeng@cn.fujitsu.com: mm/Kconfig: move auto selects from MEMORY_HOTPLUG to MEMORY_HOTREMOVE as needed] [rientjes@google.com: fix defined but not used warning] Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Wu Jianguo <wujianguo@huawei.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * memory-hotplug: introduce new arch_remove_memory() for removing page tableWen Congyang2013-02-247-0/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For removing memory, we need to remove page tables. But it depends on architecture. So the patch introduce arch_remove_memory() for removing page table. Now it only calls __remove_pages(). Note: __remove_pages() for some archtecuture is not implemented (I don't know how to implement it for s390). Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm: remove flags argument to mmap_regionMichel Lespinasse2013-02-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the MAP_POPULATE handling has been moved to mmap_region() call sites, the only remaining use of the flags argument is to pass the MAP_NORESERVE flag. This can be just as easily handled by do_mmap_pgoff(), so do that and remove the mmap_region() flags parameter. [akpm@linux-foundation.org: remove double parens] Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Andy Lutomirski <luto@amacapital.net> Cc: Greg Ungerer <gregungerer@westnet.com.au> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'next' of ↵Linus Torvalds2013-02-24185-5189/+6664
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Benjamin Herrenschmidt: "So from the depth of frozen Minnesota, here's the powerpc pull request for 3.9. It has a few interesting highlights, in addition to the usual bunch of bug fixes, minor updates, embedded device tree updates and new boards: - Hand tuned asm implementation of SHA1 (by Paulus & Michael Ellerman) - Support for Doorbell interrupts on Power8 (kind of fast thread-thread IPIs) by Ian Munsie - Long overdue cleanup of the way we handle relocation of our open firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard - Support for saving/restoring & context switching the PPR (Processor Priority Register) on server processors that support it. This allows the kernel to preserve thread priorities established by userspace. By Haren Myneni. - DAWR (new watchpoint facility) support on Power8 by Michael Neuling - Ability to change the DSCR (Data Stream Control Register) which controls cache prefetching on a running process via ptrace by Alexey Kardashevskiy - Support for context switching the TAR register on Power8 (new branch target register meant to be used by some new specific userspace perf event interrupt facility which is yet to be enabled) by Ian Munsie. - Improve preservation of the CFAR register (which captures the origin of a branch) on various exception conditions by Paulus. - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where it belongs by Philippe De Muyter - Support for Transactional Memory on Power8 by Michael Neuling (based on original work by Matt Evans). For those curious about the feature, the patch contains a pretty good description." (See commit db8ff907027b: "powerpc: Documentation for transactional memory on powerpc" for the mentioned description added to the file Documentation/powerpc/transactional_memory.txt) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits) powerpc/kexec: Disable hard IRQ before kexec powerpc/85xx: l2sram - Add compatible string for BSC9131 platform powerpc/85xx: bsc9131 - Correct typo in SDHC device node powerpc/e500/qemu-e500: enable coreint powerpc/mpic: allow coreint to be determined by MPIC version powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct powerpc/85xx: Board support for ppa8548 powerpc/fsl: remove extraneous DIU platform functions arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test powerpc: Documentation for transactional memory on powerpc powerpc: Add transactional memory to pseries and ppc64 defconfigs powerpc: Add config option for transactional memory powerpc: Add transactional memory to POWER8 cpu features powerpc: Add new transactional memory state to the signal context powerpc: Hook in new transactional memory code powerpc: Routines for FP/VSX/VMX unavailable during a transaction powerpc: Add transactional memory unavaliable execption handler powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes powerpc: Add FP/VSX and VMX register load functions for transactional memory powerpc: Add helper functions for transactional memory context switching ...
| * | powerpc/kexec: Disable hard IRQ before kexecPhileas Fogg2013-02-231-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable hard IRQ before kexec a new kernel image. Not doing it can result in corrupted data in the memory segments reserved for the new kernel. Signed-off-by: Phileas Fogg <phileas-fogg@mail.ru> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | Merge remote-tracking branch 'agust/next' into nextBenjamin Herrenschmidt2013-02-2041-3464/+656
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | << Please pull mpc5xxx patches for v3.9. The bestcomm driver is moved to drivers/dma (so it will be usable for ColdFire). mpc5121 now provides common dtsi file and existing mpc5121 device trees use it. There are some minor clock init and sparse fixes and updates for various 5200 device tree files from Grant. Some fixes for bugs in the mpc5121 DIU driver are also included here (Andrew Morton suggested to push them via my mpc5xxx tree). >>
| | * | powerpc/5200: Use the gpt* labels to simplify mpc5200 dts filesGrant Likely2013-02-1111-235/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The DTC labels feature allows a dts file to reference a node without having to reproduce the entire node hierarchy above it. We can use this to simplify the MPC5200 board dts files by referencing the gpt nodes by label. Cc: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> [agust: fixed gpt7 phandle in the csi node of o2d.dtsi] Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/5200: Add Lite5200 on-board LEDs as devicesGrant Likely2013-02-112-12/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Lite5200 evaluation board has a number of debug LEDs that Linux doesn't know about yet. This change adds a gpio-leds stanza to the lite5200 device tree so that the correct driver can get hooked up. Also, make use of the dtc labels feature to reduce the number of source lines required to add the gpio-controller property to the general purpose timer nodes. In addition, the required #gpio-cells properties are added to the common mpc5200b dtsi include file so that each board doesn't need to add them explicitly. This still doesn't enable gpio mode, 'gpio-controller' is required for that, but it means less work needs to be done by board ports. Cc: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
| | * | powerpc/mpc5xxx: fix sparse warning for non static symbolAnatolij Gustschin2013-02-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix warning: symbol 'mpc5xxx_get_bus_frequency' was not declared. Should it be static? Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/mpc512x: fix sparce warnings for non static symbolsAnatolij Gustschin2013-02-052-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix warnings: symbol 'clockctl' was not declared. Should it be static? symbol 'rate_clks' was not declared. Should it be static? symbol 'dev_clks' was not declared. Should it be static? symbol 'mpc5121_clk_init' was not declared. Should it be static? Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/mpc512x: fix noderef sparse warningsAnatolij Gustschin2013-02-051-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix: warning: dereference of noderef expression Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/512x: add function for chip select parameter configurationAnatolij Gustschin2013-02-052-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ability to configure chip select (CS) parameters for devices that need different CS parameters setup after their configuration. I.e. an FPGA device on LP bus can require different CS parameters for its bus interface after loading firmware into it. A driver can easily reconfigure the LPC CS parameters using this function. Acked-by: Timur Tabi <timur@tabi.org> Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/512x: initialize clocks before bus probingAnatolij Gustschin2013-01-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Early driver probing can fail due to not available clocks (clk_get() fails) since the clk API init didn't take place yet. Move clocks init before bus probing. Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | mpc5121: don't check PSC ac97 using node nameAnatolij Gustschin2013-01-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The .dtsi now names all PSC nodes as "psc", so this ac97 check won't work. Check for ac97 PSC using compatible property. Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | mpc5121: remove obsolete cell-index property from PSC clock codeAnatolij Gustschin2013-01-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't use cell-index from device tree, obtain the PSC number from PSCx register offset. Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/mpc5121: pdm360ng.dts: use common mpc5121.dtsiAnatolij Gustschin2013-01-151-242/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Change dts file for pdm360ng board to use common mpc5121 SoC dtsi file. Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc/mpc5121: add common .dtsi and use it in mpc5121ads.dtsAnatolij Gustschin2013-01-152-280/+449
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide common mpc5121.dtsi file for mpc5121 SoC and modify mpc5121ads.dts to use mpc5121.dtsi. Signed-off-by: Anatolij Gustschin <agust@denx.de>
| | * | powerpc, dma: move bestcomm driver from arch/powerpc/sysdev to drivers/dmaPhilippe De Muyter2013-01-0321-2676/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bestcomm dma hardware, and some of its users like the FEC ethernet component, is used in different FreeScale parts, including non-powerpc parts like the ColdFire MCF547x & MCF548x families. Don't keep the driver hidden in arch/powerpc where it is inaccessible for other arches. .c files are moved to drivers/dma/bestcomm, while .h files are moved to include/linux/fsl/bestcomm. Makefiles, Kconfigs and #include directives are updated for the new file locations. Tested by recompiling for MPC5200 with all bestcomm users enabled. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Signed-off-by: Anatolij Gustschin <agust@denx.de>
| * | | Merge remote-tracking branch 'kumar/next' into nextBenjamin Herrenschmidt2013-02-1935-560/+1369
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | << Mostly misc code cleanups in various board ports and adding support for a new MPC85xx board - ppa8548. >>
| | * | | powerpc/85xx: l2sram - Add compatible string for BSC9131 platformHarninder Rai2013-02-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Harninder Rai <harninder.rai@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: bsc9131 - Correct typo in SDHC device nodeHarninder Rai2013-02-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BSC9131RDB doesn't have SDHC enabled. As a result of this typo, the node was not getting disabled from the device tree which was leading to linux hang during bootup Signed-off-by: Harninder Rai <harninder.rai@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/e500/qemu-e500: enable coreintScott Wood2013-02-151-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MPIC code will disable coreint if it detects an insufficient MPIC version. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/mpic: allow coreint to be determined by MPIC versionScott Wood2013-02-151-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will be used by the qemu-e500 platform, as the MPIC version (and thus whether we have coreint) depends on how QEMU is configured. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr structVarun Sethi2013-02-152-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pci controller structure has a provision to store the device structure pointer of the corresponding platform device. Currently this information is not stored during fsl pci controller initialization. This information is required while dealing with iommu groups for pci devices connected to the fsl pci controller. For the case where the pci devices can't be paritioned, they would fall under the same device group as the pci controller. This patch stores the platform device information in the pci controller structure during initialization. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: Board support for ppa8548Stef van Os2013-02-155-0/+337
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initial board support for the Prodrive PPA8548 AMC module. Board is an MPC8548 AMC platform used in RapidIO systems. This module is also used to test/work on mainline linux RapidIO software. PPA8548 overview: - 1.3 GHz Freescale PowerQUICC III MPC8548 processor - 1 GB DDR2 @ 266 MHz - 8 MB NOR flash - Serial RapidIO 1.2 - 1 x 10/100/1000 BASE-T front ethernet - 1 x 1000 BASE-BX ethernet on AMC connector Signed-off-by: Stef van Os <stef.van.os@prodrive.nl> Acked-by: Timur Tabi <timur@tabi.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/fsl: remove extraneous DIU platform functionsTimur Tabi2013-02-153-55/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Freescale DIU driver was recently updated to not require every DIU platform function, so now we can remove the unneeded functions from some boards. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate testJulia Lawall2013-02-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Delete successive tests to the same location. The code tested the result of a previous call, that itself was already tested. It is changed to test the result of the most recent call. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @s exists@ local idexpression y; expression x,e; @@ *if ( \(x == NULL\|IS_ERR(x)\|y != 0\) ) { ... when forall return ...; } ... when != \(y = e\|y += e\|y -= e\|y |= e\|y &= e\|y++\|y--\|&y\) when != \(XT_GETPAGE(...,y)\|WMI_CMD_BUF(...)\) *if ( \(x == NULL\|IS_ERR(x)\|y != 0\) ) { ... when forall return ...; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: dts - add ranges property for SECPo Liu2013-02-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This facilitates getting the physical address of the SEC node. Signed-off-by: Liu po <po.liu@freescale.com> Reviewed-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/83xx: update kmeter1_defconfigHolger Brunck2013-02-131-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Synchronize this defconfig with latest kernel version. Signed-off-by: Holger Brunck <holger.brunck@keymile.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/83xx: apply mpc8360e quirk for kmeter1 only when par_io is presentGerlando Falauto2013-02-131-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no point in applying this quirk when par_io is not present. Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com> Signed-off-by: Holger Brunck <holger.brunck@keymile.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/83xx: refactor mpc8360e quirk for kmeter1Gerlando Falauto2013-02-131-72/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the code for this quirk to a dedicated function. Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com> Signed-off-by: Holger Brunck <holger.brunck@keymile.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/83xx: fix checkpatch warnings for km83xx.cHolger Brunck2013-02-131-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Holger Brunck <holger.brunck@keymile.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/82xx: fix checkpatch warnings for km82xx.cHolger Brunck2013-02-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Holger Brunck <holger.brunck@keymile.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: fix various PCI node compatible stringsTimur Tabi2013-02-134-13/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix and/or improve the compatible strings of the PCI device tree nodes for some Freescale SOCs. This fixes some issues and improves consistency among the SOCs. Specifically: 1) The P1022 has a v1 PCIe controller, so the compatible property should just say "fsl,mpc8548-pcie". U-Boot does not look for "fsl,p1022-pcie", so it wasn't fixing up the node. 2) The P4080 has a v2.1 PCIe controller, so add that version-specific string to the device tree. Update the kernel to also look for that string. Currently, the kernel looks for "fsl,p4080-pcie" specifically, but eventually that check should be deleted. 3) The P1010 device tree claims compatibility with v2.2 and v2.3, but that's redundant. No other device tree does this. Remove the v2.2 string. 4) The kernel looks for both "fsl,p1023-pcie" and "fsl,qoriq-pcie-v2.2", even though the P1023 device trees has always included both strings. Remove the search for "fsl,p1023-pcie". Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: describe the PAMU topology in the device treeTimur Tabi2013-02-135-48/+378
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PAMU caches use the LIODNs to determine which cache lines hold the entries for the corresponding LIODs. The LIODNs must therefore be carefully assigned to avoid cache thrashing -- two active LIODs with LIODNs that put them in the same cache line. Currently, LIODNs are statically assigned by U-Boot, but this has limitations. LIODNs are assigned even for devices that may be disabled or unused by the kernel. Static assignments also do not allow for device drivers which may know which LIODs can be used simultaneously. In other words, we really should assign LIODNs dynamically in Linux. To do that, we need to describe the PAMU device and cache topologies in the device trees. Signed-off-by: Timur Tabi <timur@freescale.com> Acked-by: Stuart Yoder <stuart.yoder@freescale.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: enable MTD options in sbc8548 defconfigPaul Gortmaker2013-02-131-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This board has soldered on flash, and a SODIMM flash module. Both can be used for booting, via switching JP12 and SW2.8 and using the sbc8548-altflash.dts when booting from SODIMM. Here we enable MTD in kernel so that we can see the bootloader (and other flash sectors) from linux. Normal configuration: root@sbc8548:~# cat /proc/mtd dev: size erasesize name mtd0: 007a0000 00020000 "space" mtd1: 00060000 00020000 "bootloader" mtd2: 03f00000 00080000 "space" mtd3: 00100000 00080000 "bootloader" root@sbc8548:~# dd if=/dev/mtd1 count=1 bs=48|hexdump -C 1+0 records in 1+0 records out 00000000 27 05 19 56 55 2d 42 6f 6f 74 20 32 30 31 32 2e |'..VU-Boot 2012.| 00000010 31 30 2d 64 69 72 74 79 20 28 4a 61 6e 20 31 39 |10-dirty (Jan 19| 00000020 20 32 30 31 33 20 2d 20 31 39 3a 34 30 3a 31 31 | 2013 - 19:40:11| 00000030 root@sbc8548:~# dd if=/dev/mtd3 count=1 bs=48|hexdump -C 1+0 records in 1+0 records out 00000000 27 05 19 56 55 2d 42 6f 6f 74 20 32 30 31 32 2e |'..VU-Boot 2012.| 00000010 31 30 2d 64 69 72 74 79 20 28 44 65 63 20 31 33 |10-dirty (Dec 13| 00000020 20 32 30 31 32 20 2d 20 31 35 3a 30 30 3a 30 37 | 2012 - 15:00:07| 00000030 root@sbc8548:~# Alternate configuration, with sbc8548-altflash.dts: root@sbc8548:~# cat /proc/mtd dev: size erasesize name mtd0: 03f00000 00080000 "space" mtd1: 00100000 00080000 "bootloader" mtd2: 007a0000 00020000 "space" mtd3: 00060000 00020000 "bootloader" root@sbc8548:~# dd if=/dev/mtd1 count=1 bs=48|hexdump -C 1+0 records in 1+0 records out 00000000 27 05 19 56 55 2d 42 6f 6f 74 20 32 30 31 32 2e |'..VU-Boot 2012.| 00000010 31 30 2d 64 69 72 74 79 20 28 44 65 63 20 31 33 |10-dirty (Dec 13| 00000020 20 32 30 31 32 20 2d 20 31 35 3a 30 30 3a 30 37 | 2012 - 15:00:07| 00000030 root@sbc8548:~# dd if=/dev/mtd3 count=1 bs=48|hexdump -C 1+0 records in 1+0 records out 00000000 27 05 19 56 55 2d 42 6f 6f 74 20 32 30 31 32 2e |'..VU-Boot 2012.| 00000010 31 30 2d 64 69 72 74 79 20 28 4a 61 6e 20 31 39 |10-dirty (Jan 19| 00000020 20 32 30 31 33 20 2d 20 31 39 3a 34 30 3a 31 31 | 2013 - 19:40:11| 00000030 root@sbc8548:~# Note that in the latter, the larger SODIMM device appears 1st, as mtd0 and mtd1, as indicated in the sizes, and in the date of the u-boot image. The kernel configuration is the same in both cases; only the dtb needs to be changed in accordance with the JP12/SW2.8 settings. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: add alternate dts file for sbc8548 boot via SODIMMPaul Gortmaker2013-02-131-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By moving the two JP12 jumpers 90 degrees, and switching the setting of SW2.8, the sbc8548 can be configured to boot off the alternate 64MB SODIMM, which when populated with u-boot can be a handy recovery option, in case the u-boot in the 8MB soldered on flash gets corrupted. Here we add an alternate dts file to match that configuration. To better highlight the differences, the output from the u-boot "fli" command is shown for the normal configuration and then the alternate configuration. Normal: ----------------------- Bank # 1: CFI conformant flash (8 x 8) Size: 8 MB in 64 Sectors Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x17 Erase timeout: 4096 ms, write timeout: 1 ms Buffer write timeout: 2 ms, buffer size: 32 bytes Sector Start Addresses: FF800000 E FF820000 E FF840000 E FF860000 E FF880000 E [...] FFEE0000 E FFF00000 E FFF20000 E FFF40000 E FFF60000 E FFF80000 FFFA0000 RO FFFC0000 RO FFFE0000 RO Bank # 2: CFI conformant flash (32 x 8) Size: 64 MB in 128 Sectors Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x18 Erase timeout: 4096 ms, write timeout: 1 ms Buffer write timeout: 2 ms, buffer size: 32 bytes Sector Start Addresses: EC000000 E EC080000 E EC100000 E EC180000 E EC200000 E [...] EFC00000 E EFC80000 E EFD00000 E EFD80000 E EFE00000 E EFE80000 E EFF00000 EFF80000 ----------------------- Alternate: ----------------------- Bank # 1: CFI conformant flash (32 x 8) Size: 64 MB in 128 Sectors Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x18 Erase timeout: 4096 ms, write timeout: 1 ms Buffer write timeout: 2 ms, buffer size: 32 bytes Sector Start Addresses: FC000000 E FC080000 E FC100000 E FC180000 E FC200000 E [...] FFC00000 E FFC80000 E FFD00000 E FFD80000 E FFE00000 E FFE80000 E FFF00000 RO FFF80000 RO Bank # 2: CFI conformant flash (8 x 8) Size: 8 MB in 64 Sectors Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x17 Erase timeout: 4096 ms, write timeout: 1 ms Buffer write timeout: 2 ms, buffer size: 32 bytes Sector Start Addresses: EF800000 E EF820000 E EF840000 E EF860000 E EF880000 E [...] EFEE0000 E EFF00000 E EFF20000 E EFF40000 E EFF60000 E EFF80000 E EFFA0000 EFFC0000 EFFE0000 ----------------------- Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: update sbc8548 flash information to match recent u-bootPaul Gortmaker2013-02-131-19/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original memory map for the sbc8548 had the 64MB SODIMM flash device misaligned by 8MB to allow a window of address space for the soldered on 8MB device -- i.e. start end CS<n> width Desc. ---------------------------------------------------------- fb80_0000 ff7f_ffff CS6 32 SODIMM flash (64MB) ff80_0000 ffff_ffff CS0 8 Boot flash (8MB) However, if we want to change the configuration so that it boots off the 64MB flash, it is in turn then aligned with a 64MB boundary, starting at fc00_0000 (and the 8MB @ fb80_0000 -> fbff_ffff). This makes for complicated updates, since what is the beginning of the physical device is 8MB into its address space in the default configuration shown above. This issue was fixed as of u-boot commit 3fd673cf363bc86ed42eff713d4 ("sbc8548: relocate 64MB user flash to sane boundary") -- in which the SODIMM was mapped to ec00_0000 (natively aligned under efff_ffff) and so when JP12/SW2.8 are switched, it will be a a simple 0xec --> 0xfc mapping between the two instances. Here we make the associated changes in the localbus flash memory map in the dts file: indicating the 64MB device starts at ec00_0000 and that the tail end of the 64MB device (last 2 sectors) can contain a bootloader image. The partitions for both flash devices get a clean-up; there were non-meaningful assignments in there that probably originated from the MPC8548CDS on which the file was based on. Now there is just the categorization of free space and bootloader images. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>