summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/ipu-v3/ipu-image-convert.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915: Move abs_diff() to math.hAndy Shevchenko2023-08-181-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | abs_diff() belongs to math.h. Move it there. This will allow others to use it. [andriy.shevchenko@linux.intel.com: add abs_diff() documentation] Link: https://lkml.kernel.org/r/20230804050934.83223-1-andriy.shevchenko@linux.intel.com [akpm@linux-foundation.org: fix comment, per Randy] Link: https://lkml.kernel.org/r/20230803131918.53727-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> # tty/serial Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> # gpu/ipu-v3 Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Helge Deller <deller@gmx.de> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* gpu: ipu-v3: image-convert: use swap()Salah Triki2022-04-041-6/+3
| | | | | | | | | Use swap() instead of implementing it since it makes code cleaner. Signed-off-by: Salah Triki <salah.triki@gmail.com> Link: https://lore.kernel.org/r/20210713140521.GA1873885@pc [p.zabel@pengutronix.de: add "image-convert:" prefix to commit description] Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: Wait for all EOFs before completing a tileSteve Longerbeam2020-07-201-27/+82
| | | | | | | | | | | | | | Use a bit-mask of EOF irqs to determine when all required idmac channel EOFs have been received for a tile conversion, and only do tile completion processing after all EOFs have been received. Otherwise it was found that a conversion would stall after the completion of a tile and the start of the next tile, because the input/read idmac channel had not completed and entered idle state, thus locking up the channel when attempting to re-start it for the next tile. Fixes: 0537db801bb01 ("gpu: ipu-v3: image-convert: reconfigure IC per tile") Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: Combine rotate/no-rotate irq handlersSteve Longerbeam2020-07-201-38/+20
| | | | | | | | Combine the rotate_irq() and norotate_irq() handlers into a single eof_irq() handler. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: only sample into the next tile if necessaryPhilipp Zabel2019-08-191-2/+2
| | | | | | | | The first pixel of the next tile is only sampled by the hardware if the fractional input position corresponding to the last written output pixel is not an integer position. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: move tile burst alignment out of loopPhilipp Zabel2019-08-191-39/+45
| | | | | | | | | Burst aligned input and output width can be calculated once per column, instead of repeatedly for each tile in the column. The same goes for input and output height per row. Also don't round up the same values repeatedly. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: bail on invalid tile sizesPhilipp Zabel2019-08-191-3/+24
| | | | | | | | | If we managed to create tiles sized 0x0 because of a bug in the seam calculation, return with an error message instead of letting the driver run into a division by zero later. Also check for tile sizes that are larger than supported by the hardware. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: fix image downsize coefficients and tiling ↵Philipp Zabel2019-08-191-18/+31
| | | | | | | | | | | | | | | | | | | | | calculation This patch effectively reverts commit 912bbf7e9ca4 ("gpu: ipu-v3: image-convert: Fix image downsize coefficients") and replaces it with a different solution based on the preceding patches. The previous fix tried to solve the problem of intermediate tile size between IC downsizing and main processing sections not being limited to 1024 pixels by downsizing the input image to a smaller intermediate size in the downsizing box filter. This causes unnecessary blurring, especially for scaling factors close to 1. Now that the seam position calculation makes sure that the 1024 pixel intermediate tile size limit is not exceeded, calculate the number of tiles from the maximum of intermediate size and output size and avoid unnecessary downsizing. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: limit input seam position to hardware requirementsPhilipp Zabel2019-08-191-6/+24
| | | | | | | | Limit the input seam position to an interval that guarantees the tile size does not exceed 1024 pixels after the IC downsizing section and that space is left for the next tile. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: fix output seam valid intervalPhilipp Zabel2019-08-191-2/+2
| | | | | | | | | | | | This fixes a failure to determine any seam if the output size is exactly 1024 multiplied by the number of tiles in a given direction. In that case an empty interval out_start == out_end is being passed to find_best_seam, which looks for a seam out_start <= x < out_end. Also reduce the interval for all but the left column / top row, to avoid returning position 0 as best fit. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: move output seam valid interval calculation into ↵Philipp Zabel2019-08-191-22/+12
| | | | | | | | | find_best_seam This reduces code duplication and allows to apply the following modifications in a single place. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: enable V4L2_PIX_FMT_BGRX32 and _RGBX32Philipp Zabel2019-08-191-0/+6
| | | | | | | Enable image converter support for V4L2_PIX_FMT_BGRX32 and V4L2_PIX_FMT_RGBX32 pixel formats. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* Merge tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2019-07-161-11/+26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "The biggest thing in this is the AMD Navi GPU support, this again contains a bunch of header files that are large. These are the new AMD RX5700 GPUs that just recently became available. New drivers: - ST-Ericsson MCDE driver - Ingenic JZ47xx SoC UAPI change: - HDR source metadata property Core: - HDR inforframes and EDID parsing - drm hdmi infoframe unpacking - remove prime sg_table caching into dma-buf - New gem vram helpers to reduce driver code - Lots of drmP.h removal - reservation fencing fix - documentation updates - drm_fb_helper_connector removed - mode name command handler rewrite fbcon: - Remove the fbcon notifiers ttm: - forward progress fixes dma-buf: - make mmap call optional - debugfs refcount fixes - dma-fence free with pending signals fix - each dma-buf gets an inode Panels: - Lots of additional panel bindings amdgpu: - initial navi10 support - avoid hw reset - HDR metadata support - new thermal sensors for vega asics - RAS fixes - use HMM rather than MMU notifier - xgmi topology via kfd - SR-IOV fixes - driver reload fixes - DC use a core bpc attribute - Aux fixes for DC - Bandwidth calc updates for DC - Clock handling refactor - kfd VEGAM support vmwgfx: - Coherent memory support changes i915: - HDR Support - HDMI i2c link - Icelake multi-segmented gamma support - GuC firmware update - Mule Creek Canyon PCH support for EHL - EHL platform updtes - move i915.alpha_support to i915.force_probe - runtime PM refactoring - VBT parsing refactoring - DSI fixes - struct mutex dependency reduction - GEM code reorg mali-dp: - Komeda driver features msm: - dsi vs EPROBE_DEFER fixes - msm8998 snapdragon 835 support - a540 gpu support - mdp5 and dpu interconnect support exynos: - drmP.h removal tegra: - misc fixes tda998x: - audio support improvements - pixel repeated mode support - quantisation range handling corrections - HDMI vendor info fix armada: - interlace support fix - overlay/video plane register handling refactor - add gamma support rockchip: - RX3328 support panfrost: - expose perf counters via hidden ioctls vkms: - enumerate CRC sources list ast: - rework BO handling mgag200: - rework BO handling dw-hdmi: - suspend/resume support rcar-du: - R8A774A1 Soc Support - LVDS dual-link mode support - Additional formats - Misc fixes omapdrm: - DSI command mode display support stm - fb modifier support - runtime PM support sun4i: - use vmap ops vc4: - binner bo binding rework v3d: - compute shader support - resync/sync fixes - job management refactoring lima: - NULL pointer in irq handler fix - scheduler default timeout virtio: - fence seqno support - trace events bochs: - misc fixes tc458767: - IRQ/HDP handling sii902x: - HDMI audio support atmel-hlcdc: - misc fixes meson: - zpos support" * tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drm: (1815 commits) Revert "Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux into drm-next" Revert "mm: adjust apply_to_pfn_range interface for dropped token." mm: adjust apply_to_pfn_range interface for dropped token. drm/amdgpu/navi10: add uclk activity sensor drm/amdgpu: properly guard the generic discovery code drm/amdgpu: add missing documentation on new module parameters drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback drm/amd/display: avoid 64-bit division drm/amdgpu/psp11: simplify the ucode register logic drm/amdgpu: properly guard DC support in navi code drm/amd/powerplay: vega20: fix uninitialized variable use drm/amd/display: dcn20: include linux/delay.h amdgpu: make pmu support optional drm/amd/powerplay: Zero initialize current_rpm in vega20_get_fan_speed_percent drm/amd/powerplay: Zero initialize freq in smu_v11_0_get_current_clk_freq drm/amd/powerplay: Use memset to initialize metrics structs drm/amdgpu/mes10.1: Fix header guard drm/amd/powerplay: add temperature sensor support for navi10 drm/amdgpu: fix scheduler timeout calc drm/amdgpu: Prepare for hmm_range_register API change (v2) ...
| * gpu: ipu-v3: image-convert: Enable double write reductionSteve Longerbeam2019-06-141-0/+9
| | | | | | | | | | | | | | | | | | | | For the write channels with 4:2:0 subsampled YUV formats, avoid chroma overdraw by only writing chroma for even lines (skip odd chroma rows). This reduces necessary write memory bandwidth by at least 25% (more with rotation enabled). Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
| * gpu: ipu-v3: ipu-ic: Fully describe colorspace conversionsSteve Longerbeam2019-06-141-11/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only providing the input and output RGB/YUV space to the IC task init functions is not sufficient. To fully characterize a colorspace conversion, the Y'CbCr encoding standard, and quantization also need to be specified. Define a 'struct ipu_ic_colorspace' that includes all the above. This allows to actually enforce the fact that the IC: - can only encode to/from YUV and RGB full range. A follow-up patch will remove this restriction. - can only encode using BT.601 standard. A follow-up patch will add Rec.709 encoding support. The determination of the CSC coefficients based on the input/output 'struct ipu_ic_colorspace' are moved to a new exported function ipu_ic_calc_csc(), and 'struct ic_csc_params' is exported as 'struct ipu_ic_csc_params'. ipu_ic_calc_csc() fills a 'struct ipu_ic_csc' with the input/output 'struct ipu_ic_colorspace' and the calculated 'struct ic_csc_params' from those input/output colorspaces. The functions ipu_ic_task_init(_rsc)() now take a filled 'struct ipu_ic_csc'. The existing CSC coefficient tables and ipu_ic_calc_csc() are moved to a new module ipu-ic-csc.c. This is in preparation for adding more coefficient tables for limited range quantization and more encoding standards. The existing ycbcr2rgb and inverse rgb2ycbcr tables defined the BT.601 Y'CbCr encoding coefficients. The rgb2ycbcr table specifically described the BT.601 encoding from full range RGB to full range YUV. Table comments have been added in ipu-ic-csc.c to make this more clear. The ycbcr2rgb inverse table described encoding YUV limited range to RGB full range. To be consistent with the rgb2ycbcr table, this table is converted to YUV full range to RGB full range, and the comments are expanded in ipu-ic-csc.c. The ic_csc_rgb2rgb table was just an identity matrix, so it is renamed 'identity' in ipu-ic-csc.c. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> [p.zabel@pengutronix.de: removed a superfluous blank line] Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* | gpu: ipu-v3: image-convert: Fix image downsize coefficientsSteve Longerbeam2019-06-141-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The output of the IC downsizer unit in both dimensions must be <= 1024 before being passed to the IC resizer unit. This was causing corrupted images when: input_dim > 1024, and input_dim / 2 < output_dim < input_dim Some broken examples were 1920x1080 -> 1024x768 and 1920x1080 -> 1280x1080. Fixes: 70b9b6b3bcb21 ("gpu: ipu-v3: image-convert: calculate per-tile resize coefficients") Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* | gpu: ipu-v3: image-convert: Fix input bytesperline for packed formatsSteve Longerbeam2019-06-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The input bytesperline calculation for packed pixel formats was incorrect. The min/max clamping values must be multiplied by the packed bits-per-pixel. This was causing corrupted converted images when the input format was RGB4 (probably also other input packed formats). Fixes: d966e23d61a2c ("gpu: ipu-v3: image-convert: fix bytesperline adjustment") Reported-by: Harsha Manjula Mallikarjun <Harsha.ManjulaMallikarjun@in.bosch.com> Suggested-by: Harsha Manjula Mallikarjun <Harsha.ManjulaMallikarjun@in.bosch.com> Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* | gpu: ipu-v3: image-convert: Fix input bytesperline width/height alignSteve Longerbeam2019-06-141-11/+21
|/ | | | | | | | | | | | The output width and height alignment values were being used in the input bytesperline calculation. Fix by separating local vars w_align and h_align into w_align_in, h_align_in, w_align_out, and h_align_out. Fixes: d966e23d61a2c ("gpu: ipu-v3: image-convert: fix bytesperline adjustment") Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157Thomas Gleixner2019-05-301-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 3 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [graeme] [gregory] [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema] [hk] [hemahk]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1105 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* gpu: ipu-v3: image-convert: allow three rows or columnsPhilipp Zabel2018-11-051-6/+1
| | | | | | | | | If width or height are in the [2049, 3072] range, allow to use just three tiles in this dimension, instead of four. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: disable double buffering if necessaryPhilipp Zabel2018-11-051-2/+25
| | | | | | | | | Double-buffering only works if tile sizes are the same and the resizing coefficient does not change between tiles, even for non-planar formats. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: add some ASCII art to the expositionPhilipp Zabel2018-11-051-10/+29
| | | | | | | | Visualize the scaling and rotation pipeline with some ASCII art diagrams. Remove the FIXME comment about missing seam prevention. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: fix bytesperline adjustmentPhilipp Zabel2018-11-051-4/+12
| | | | | | | | | | | | | | | For planar formats, bytesperline does not depend on BPP. It must always be larger than width and aligned to tile width alignment restrictions. The input bytesperline to ipu_image_convert_adjust() may be uninitialized, so don't rely on input bytesperline as the minimum value for clamp_align(). Use 2 << w_align as the minimum instead. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> [slongerbeam@gmail.com: clamp input bytesperline] Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: relax alignment restrictionsPhilipp Zabel2018-11-051-40/+41
| | | | | | | | | | | | | | | | For the planar but U/V-packed formats NV12 and NV16, 8 pixel width alignment is good enough to fulfill the 8 byte stride requirement. If we allow the input 8-pixel DMA bursts to overshoot the end of the line, the only input alignment restrictions are dictated by the pixel format and 8-byte aligned line start address. Since different tile sizes are allowed, the output tile with / height alignment doesn't need to be multiplied by number of columns / rows. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> [slongerbeam@gmail.com: Bring in the fixes to format width and height alignment restrictions from imx-media-mem2mem.c.] Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: fix debug output for varying tile sizesPhilipp Zabel2018-11-051-2/+10
| | | | | | | | | Since tile dimensions now vary between tiles, add debug output for each tile's position and dimensions. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: select optimal seam positionsPhilipp Zabel2018-11-051-6/+337
| | | | | | | | | | | | | | | | Select seam positions that minimize distortions during seam hiding while satifying input and output IDMAC, rotator, and image format constraints. This code looks for aligned output seam positions that minimize the difference between the fractional corresponding ideal input positions and the input positions rounded to alignment requirements. Since now tiles can be sized differently, alignment restrictions of the complete image can be relaxed in the next step. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: move tile alignment helpersPhilipp Zabel2018-11-051-27/+27
| | | | | | | | | Move tile_width_align and tile_height_align up so they can be used by the tile edge position calculation code. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: calculate tile dimensions and offsets outside ↵Philipp Zabel2018-11-051-5/+13
| | | | | | | | | | | fill_image This will allow to calculate seam positions after initializing the ipu_image base structure but before calculating tile dimensions. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: store tile top/left positionPhilipp Zabel2018-11-051-12/+15
| | | | | | | | | Store tile top/left position in pixels in the tile structure. This will allow overlapping tiles with different sizes later. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: reconfigure IC per tilePhilipp Zabel2018-11-051-21/+44
| | | | | | | | | For differently sized tiles or if the resizing coefficients change, we have to stop, reconfigure, and restart the IC between tiles. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: calculate per-tile resize coefficientsPhilipp Zabel2018-11-051-2/+234
| | | | | | | | | | | | | | | | | Slightly modifying resize coefficients per-tile allows to completely hide the seams between tiles and to sample the correct input pixels at the bottom and right edges of the image. Tiling requires a bilinear interpolator reset at each tile start, which causes the image to be slightly shifted if the starting pixel should not have been sampled from an integer pixel position in the source image according to the full image resizing ratio. To work around this hardware limitation, calculate per-tile resizing coefficients that make sure that the correct input pixels are sampled at the tile end. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: prepare for per-tile configurationPhilipp Zabel2018-11-051-25/+35
| | | | | | | | | Let convert_start start from a given tile index, allocate intermediate tile with maximum tile size. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Steve Longerbeam <slongerbeam@gmail.com>
* gpu: ipu-v3: image-convert: Catch unaligned tile offsetsSteve Longerbeam2018-11-051-24/+37
| | | | | | | | | Catch calculated tile offsets that are not 8-byte aligned as required by the IDMAC engine and return error in calc_tile_offsets(). Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: Remove need_abort flagSteve Longerbeam2018-11-051-4/+1
| | | | | | | | | | The need_abort flag is not really needed anymore in __ipu_image_convert_abort(), remove it. No functional changes. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: Allow reentrancy into abortSteve Longerbeam2018-11-051-3/+4
| | | | | | | | | | | | Allow reentrancy into ipu_image_convert_abort(), by moving re-init of ctx->aborted completion under the spin lock, and only if there is an active run, and complete all waiters do_bh(). Note: ipu_image_convert_unprepare() is still _not_ reentrant, and can't be made reentrant. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: Only wait for abort completion if active runSteve Longerbeam2018-11-051-2/+7
| | | | | | | | | | | Only wait for the ctx->aborted completion if there is an active run in progress, otherwise the wait will just timeout after 10 seconds. If there is no active run in progress, the done queue just needs to be emptied. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: image-convert: Prevent race between run and unprepareSteve Longerbeam2018-11-051-3/+7
| | | | | | | | | | | | | | | | | Prevent possible race by parallel threads between ipu_image_convert_run() and ipu_image_convert_unprepare(). This involves setting ctx->aborting to true unconditionally so that no new job runs can be queued during unprepare, and holding the ctx->aborting flag until the context is freed. Note that the "normal" ipu_image_convert_abort() case (e.g. not during context unprepare) should clear the ctx->aborting flag after aborting any active run and clearing the context's pending queue. This is because it should be possible to continue to use the conversion context and queue more runs after an abort. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: Add chroma plane offset overrides to ipu_cpmem_set_image()Steve Longerbeam2018-11-051-5/+5
| | | | | | | | | | Allow the caller of ipu_cpmem_set_image() to override the latters calculation of the chroma plane offsets, by adding override U/V plane offsets to 'struct ipu_image'. Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: add support for XRGB32 and XBGR32 V4L2 pixel formatsPhilipp Zabel2018-08-021-0/+6
| | | | | | | These should be used instead of the ill-defined deprecated RGB32 and BGR32 V4L2 pixel formats. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: only set non-zero AXI ID for IC when PRG is absentLucas Stach2017-03-161-1/+6
| | | | | | | | | Using non-zero AXI IDs for anything other than the display channels collides with the PRG AXI snooping, so only do this if there is no PRG present. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: Use ERR_CAST instead of ERR_PTR(PTR_ERR())Wei Yongjun2016-10-171-1/+1
| | | | | | | | | Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)). Generated by: scripts/coccinelle/api/err_cast.cocci Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* gpu: ipu-v3: Add queued image conversion supportSteve Longerbeam2016-09-191-0/+1709
This patch implements image conversion support using the IC tasks, with tiling to support scaling to and from images up to 4096x4096. Image rotation is also supported. Image conversion requests are added to a run queue under the IC tasks. The internal API is subsystem agnostic (no V4L2 dependency except for the use of V4L2 fourcc pixel formats). Callers prepare for image conversion by calling ipu_image_convert_prepare(), which initializes the parameters of the conversion. The caller passes in the ipu and IC task to use for the conversion, the input and output image formats, a rotation mode, and a completion callback and completion context pointer: struct ipu_image_converter_ctx * ipu_image_convert_prepare(struct ipu_soc *ipu, enum ipu_ic_task ic_task, struct ipu_image *in, struct ipu_image *out, enum ipu_rotate_mode rot_mode, ipu_image_converter_cb_t complete, void *complete_context); A new conversion context is created that is added to an IC task context queue. The caller is given the new conversion context, which can then be passed to the further APIs: int ipu_image_convert_queue(struct ipu_image_converter_run *run); This queues the given image conversion request run to a run queue, and starts the conversion immediately if the run queue is empty. Only the physaddr's of the input and output image buffers are needed, since the conversion context was created previously with ipu_image_convert_prepare(). When the conversion completes, the run pointer is returned to the completion callback. void ipu_image_convert_abort(struct ipu_image_converter_ctx *ctx); This will abort any active or pending conversions for this context. Any currently active or pending runs belonging to this context are returned via the completion callback with an error status. void ipu_image_convert_unprepare(struct ipu_image_converter_ctx *ctx); Unprepares the conversion context. Any active or pending runs will be aborted by calling ipu_image_convert_abort(). Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>