Table of Contents
- 1 Setting up DMA mappings
- Cache coherency and DMA
- DMA mappings
- Coherent mappings
- Streaming DMA mappings
- 2 The concept of completion
- 3 The DMA Engine API
- Allocating a DMA slave channel
- Setting slave and controller specific parameters
- Getting a descriptor for the transaction
- Submitting the transaction
- Issuing pending DMA requests and waiting for callback notification
- 4 Programs
- Single buffer mapping
- Scatter/gather mapping
DMA is a feature of computer systems that allows devices to access main system memory (RAM) without CPU intervention, freeing the CPU for other tasks.
The DMA controller is the peripheral responsible for managing DMA; it is found in modern processors and microcontrollers alike. DMA is used to perform memory read and write operations without consuming CPU cycles. When a block of data needs to be transferred, the processor supplies the DMA controller with the source address, the destination address, and the total byte count; the controller then transfers the data from source to destination automatically, without stealing CPU cycles.
1 Setting up DMA mappings
For any type of DMA transfer, the source address, the destination address, and the number of words to transfer must be provided. In the case of peripheral DMA, the peripheral's FIFO serves as either the source or the destination.
When peripheral DMA is used, the source or destination address is specified according to the transfer direction. In other words, a DMA transfer requires a suitable memory mapping, which is the subject of the sections that follow.
Cache coherency and DMA
Imagine a CPU equipped with a cache and external memory, both of which a device with DMA can access directly. When the CPU accesses a location X in memory, the current value is stored in the cache; subsequent operations on X update the cached copy but not the external memory version, assuming a write-back cache. If the cache is not flushed to memory before the next time the device tries to access X, the device reads a stale value of X. Similarly, if the cached copy of X is not invalidated when a new value is written to memory, the CPU operates on a stale value of X.
This problem arises whenever two independent devices share memory. Cache coherency ensures that every write appears to happen instantaneously, so that all devices sharing a memory region see exactly the same sequence of changes.
DMA mappings
Any suitable DMA transfer requires a suitable memory mapping. A DMA mapping is the combination of allocating a DMA buffer and generating a bus address for it. Devices actually use bus addresses, which are instances of type dma_addr_t.
There are two types of mapping: coherent DMA mappings and streaming DMA mappings. The former can be used for several transfers and resolves buffer coherency issues automatically. Streaming mappings come with a number of constraints and do not resolve coherency automatically, although there is a solution for that, consisting of a few function calls between each transfer. A coherent DMA mapping typically exists for the whole lifetime of the driver, whereas a streaming mapping is usually unmapped as soon as the DMA transfer completes.
The main header to include when dealing with DMA mappings is:
#include <linux/dma-mapping.h>
Coherent mappings
The following function sets up a coherent mapping:
void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flag);
This function handles both the allocation and the mapping of the buffer, and returns the kernel virtual address of that buffer, which is size bytes long and accessible to the CPU. dev is the device structure, and dma_handle points to the bus address. The memory allocated for the mapping is guaranteed to be physically contiguous, and flag determines how the memory is allocated, usually GFP_KERNEL or GFP_ATOMIC.
The mapping is released with the following function:
void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t dma_handle);
cpu_addr corresponds to the kernel virtual address returned by dma_alloc_coherent(). This mapping is precious: the minimum amount of memory it can allocate is a page, and it only allocates page counts that are powers of two. This type of mapping should be used for buffers that persist over the lifetime of the device.
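As a quick illustration, here is a minimal sketch of the coherent-mapping lifecycle, assuming a hypothetical driver that already holds a struct device pointer (for example &pdev->dev); the buffer size and the helper names are illustrative:

#include <linux/dma-mapping.h>

#define COHERENT_BUF_SIZE 4096    /* at least one full page is allocated anyway */

static void *cpu_buf;             /* kernel virtual address, for the CPU */
static dma_addr_t bus_addr;       /* bus address, for the device */

static int setup_coherent_buffer(struct device *dev)
{
    cpu_buf = dma_alloc_coherent(dev, COHERENT_BUF_SIZE, &bus_addr, GFP_KERNEL);
    if (!cpu_buf)
        return -ENOMEM;
    /* program bus_addr into the device's DMA address register here */
    return 0;
}

static void teardown_coherent_buffer(struct device *dev)
{
    dma_free_coherent(dev, COHERENT_BUF_SIZE, cpu_buf, bus_addr);
}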
Streaming DMA mappings
Streaming mappings have more constraints and differ from coherent mappings for the following reasons:
- The mapping needs to work with a buffer that has already been allocated.
- The mapping can accept several scattered, non-contiguous buffers.
- A mapped buffer belongs to the device, not to the CPU. Before the CPU can use the buffer, it should be unmapped first; this is because of caching.
- For write transactions, the driver should place the data in the buffer before mapping it.
- The direction of data movement must be specified, and the data may only be used according to that direction.
Why shouldn't the buffer be accessed until it is unmapped? The reason is simple: CPU mappings are cacheable. The dma_map_*() family of functions used for streaming mappings will first clean (invalidate) the caches related to the buffer, and the CPU must not access the buffer until the matching dma_unmap_*() call.
In practice, streaming mappings come in two forms:
- single buffer mapping, which allows only single-page mapping;
- scatter/gather mapping, which allows several buffers (scattered over memory) to be passed.
For either type of mapping, the direction should be specified with a symbol of type enum dma_data_direction, defined in include/linux/dma-direction.h:

enum dma_data_direction {
    DMA_BIDIRECTIONAL = 0,
    DMA_TO_DEVICE = 1,
    DMA_FROM_DEVICE = 2,
    DMA_NONE = 3,
};
- Single buffer mapping
A single buffer mapping can be set up with:
dma_addr_t dma_map_single(struct device *dev, void *ptr, size_t size, enum dma_data_direction direction);
ptr is the kernel virtual address of the buffer, and the returned dma_addr_t is the bus address to hand to the device. Make sure to use the direction that really matches your needs.
The mapping is released with:
void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size, enum dma_data_direction direction);
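As an illustration, here is a minimal sketch of the map/transfer/unmap cycle, assuming a hypothetical struct device *dev and a kmalloc'ed buffer; dma_mapping_error() is the standard way to validate the returned bus address:

#include <linux/slab.h>
#include <linux/dma-mapping.h>

static int do_single_transfer(struct device *dev, size_t len)
{
    void *buf;
    dma_addr_t handle;

    buf = kmalloc(len, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;

    /* from here on the buffer belongs to the device, not to the CPU */
    handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
    if (dma_mapping_error(dev, handle)) {
        kfree(buf);
        return -ENOMEM;
    }

    /* hand `handle` to the device and run the transfer here */

    /* give the buffer back to the CPU */
    dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
    kfree(buf);
    return 0;
}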
- Scatter/gather mapping
Scatter/gather mapping is a special type of streaming mapping that can transfer several buffer regions in a single shot, instead of mapping each buffer individually and transferring them one by one. Suppose there are several buffers that are not physically contiguous, all of which need to be transferred to or from the device at the same time.
The kernel represents a scattered buffer as a struct scatterlist:

struct scatterlist {
    unsigned long   page_link;
    unsigned int    offset;
    unsigned int    length;
    dma_addr_t      dma_address;
    unsigned int    dma_length;
};
To set up a scatterlist mapping, you should:
- allocate the scattered buffers,
- create the scatterlist array and fill it with the allocated memory using sg_set_buf(),
- call dma_map_sg() on the scatterlist,
- once the DMA is done, call dma_unmap_sg() to unmap the scatterlist.
u32 *wbuf1, *wbuf2, *wbuf3;
struct scatterlist sg[3];
int ret;

/* Allocate the scattered buffers */
wbuf1 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
wbuf2 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
wbuf3 = kzalloc(SDMA_BUF_SIZE / 2, GFP_DMA);

/* Create the scatterlist array and fill it with the allocated memory */
sg_init_table(sg, 3);
sg_set_buf(&sg[0], wbuf1, SDMA_BUF_SIZE);
sg_set_buf(&sg[1], wbuf2, SDMA_BUF_SIZE);
sg_set_buf(&sg[2], wbuf3, SDMA_BUF_SIZE / 2);

/*
 * dma_map_sg() takes an enum dma_data_direction; DMA_BIDIRECTIONAL is
 * used here (the original snippet passed DMA_MEM_TO_MEM, which belongs
 * to enum dma_transfer_direction). A real struct device should be
 * passed instead of NULL.
 */
ret = dma_map_sg(NULL, sg, 3, DMA_BIDIRECTIONAL);
dma_map_sg() and dma_unmap_sg() take care of cache coherency. However, if the same mapping must be used to access the data between DMA transfers, the buffers must be synchronized appropriately between transfers: with dma_sync_sg_for_cpu() if the CPU needs to access the buffers, or with dma_sync_sg_for_device() if the device does. The single-region counterparts are dma_sync_single_for_cpu() and dma_sync_single_for_device().
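A minimal sketch of that synchronization, assuming the scatterlist mapping from the snippet above is kept across several transfers, and that dev is the device the mapping was created against:

/* give the buffers back to the CPU to inspect or update the data */
dma_sync_sg_for_cpu(dev, sg, 3, DMA_BIDIRECTIONAL);
/* ... CPU touches wbuf1..wbuf3 here ... */
/* hand the buffers back to the device before the next transfer */
dma_sync_sg_for_device(dev, sg, 3, DMA_BIDIRECTIONAL);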
2 The concept of completion
Using completions requires the following header:
#include <linux/completion.h>
A completion is represented by struct completion and can be created statically or dynamically.
Static declaration and initialization:
DECLARE_COMPLETION(my_comp);
Dynamic allocation:
struct completion my_comp;
init_completion(&my_comp);
When work started by the driver must wait for something to finish, it simply passes the completion event to wait_for_completion(); the task that calls wait_for_completion() blocks:
void wait_for_completion(struct completion *comp);
When another part of the code determines that the event has occurred, it can wake up the process(es) waiting on it with:
void complete(struct completion *comp);
void complete_all(struct completion *comp);
complete() wakes up only one waiting process, while complete_all() wakes up all processes waiting on the event.
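A minimal sketch of the pattern, assuming a hypothetical driver in which an interrupt handler signals the end of a transfer to a waiting task; the names are illustrative:

#include <linux/completion.h>
#include <linux/interrupt.h>

static DECLARE_COMPLETION(xfer_done);

static irqreturn_t xfer_irq_handler(int irq, void *dev_id)
{
    /* the hardware signalled end of transfer: wake up the waiter */
    complete(&xfer_done);
    return IRQ_HANDLED;
}

static int run_transfer(void)
{
    /* start the transfer here, then block until the IRQ fires */
    wait_for_completion(&xfer_done);
    return 0;
}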
3 The DMA Engine API
The DMA engine is the generic kernel framework for developing DMA controller drivers. The main goal of DMA is to offload the CPU when copying memory. Transfers of I/O data are delegated to the DMA engine using channels; a DMA engine, through its driver, provides a set of channels that other devices (slaves) can use.
The required header is:
#include <linux/dmaengine.h>
Slave DMA usage is straightforward and consists of the following steps:
- Allocate a DMA slave channel
- Set slave and controller specific parameters
- Get a descriptor for the transaction
- Submit the transaction
- Issue pending requests and wait for callback notification
Allocating a DMA slave channel
A channel is requested with dma_request_channel():
struct dma_chan *dma_request_channel(const dma_cap_mask_t *mask, dma_filter_fn fn, void *fn_param);
mask is a bitmap mask specifying the type of transfers the driver needs to perform, i.e. the capabilities the channel must satisfy.
dma_filter_fn is defined as:
typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);
If the fn parameter is NULL, dma_request_channel() simply returns the first channel that satisfies the capability mask. Otherwise, when the mask parameter is not enough to specify the required channel, fn can be used as a filter over the channels available in the system.
A channel allocated through this interface is exclusive to the caller until dma_release_channel() is called:
void dma_release_channel(struct dma_chan *chan);
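A minimal sketch of requesting the first memcpy-capable channel (no filter callback) and releasing it afterwards; error handling is trimmed:

dma_cap_mask_t mask;
struct dma_chan *chan;

dma_cap_zero(mask);
dma_cap_set(DMA_MEMCPY, mask);

/* fn == NULL: take the first channel satisfying the capability mask */
chan = dma_request_channel(mask, NULL, NULL);
if (!chan)
    return -ENODEV;

/* ... use the channel ... */

dma_release_channel(chan);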
Setting slave and controller specific parameters
This step introduces a new data structure, struct dma_slave_config, which represents the runtime configuration of a DMA slave channel and allows peripheral-specific settings to be given:
int dmaengine_slave_config(struct dma_chan *chan, struct dma_slave_config *config);
/**
 * struct dma_slave_config - dma slave channel runtime config
 * @direction: whether the data shall go in or out on this slave
 * channel, right now. DMA_MEM_TO_DEV and DMA_DEV_TO_MEM are
 * legal values. DEPRECATED, drivers should use the direction argument
 * to the device_prep_slave_sg and device_prep_dma_cyclic functions or
 * the dir field in the dma_interleaved_template structure.
 * @src_addr: this is the physical address where DMA slave data
 * should be read (RX), if the source is memory this argument is
 * ignored.
 * @dst_addr: this is the physical address where DMA slave data
 * should be written (TX), if the source is memory this argument
 * is ignored.
 * @src_addr_width: this is the width in bytes of the source (RX)
 * register where DMA data shall be read. If the source
 * is memory this may be ignored depending on architecture.
 * Legal values: 1, 2, 3, 4, 8, 16, 32, 64, 128.
 * @dst_addr_width: same as src_addr_width but for destination
 * target (TX) mutatis mutandis.
 * @src_maxburst: the maximum number of words (note: words, as in
 * units of the src_addr_width member, not bytes) that can be sent
 * in one burst to the device. Typically something like half the
 * FIFO depth on I/O peripherals so you don't overflow it. This
 * may or may not be applicable on memory sources.
 * @dst_maxburst: same as src_maxburst but for destination target
 * mutatis mutandis.
 * @src_port_window_size: The length of the register area in words the data need
 * to be accessed on the device side. It is only used for devices which is using
 * an area instead of a single register to receive the data. Typically the DMA
 * loops in this area in order to transfer the data.
 * @dst_port_window_size: same as src_port_window_size but for the destination
 * port.
 * @device_fc: Flow Controller Settings. Only valid for slave channels. Fill
 * with 'true' if peripheral should be flow controller. Direction will be
 * selected at Runtime.
 * @slave_id: Slave requester id. Only valid for slave channels. The dma
 * slave peripheral will have unique id as dma requester which need to be
 * pass as slave config.
 * @peripheral_config: peripheral configuration for programming peripheral
 * for dmaengine transfer
 * @peripheral_size: peripheral configuration buffer size
 *
 * This struct is passed in as configuration data to a DMA engine
 * in order to set up a certain channel for DMA transport at runtime.
 * The DMA device/engine has to provide support for an additional
 * callback in the dma_device structure, device_config and this struct
 * will then be passed in as an argument to the function.
 *
 * The rationale for adding configuration information to this struct is as
 * follows: if it is likely that more than one DMA slave controllers in
 * the world will support the configuration option, then make it generic.
 * If not: if it is fixed so that it be sent in static from the platform
 * data, then prefer to do that.
 */
struct dma_slave_config {
    enum dma_transfer_direction direction;
    phys_addr_t src_addr;
    phys_addr_t dst_addr;
    enum dma_slave_buswidth src_addr_width;
    enum dma_slave_buswidth dst_addr_width;
    u32 src_maxburst;
    u32 dst_maxburst;
    u32 src_port_window_size;
    u32 dst_port_window_size;
    bool device_fc;
    unsigned int slave_id;
    void *peripheral_config;
    size_t peripheral_size;
};
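A minimal sketch of filling this structure for a hypothetical peripheral-to-memory channel; FIFO_RX_REG stands for the (assumed) physical address of the peripheral's RX FIFO, and the width/burst values are purely illustrative:

struct dma_slave_config cfg = {0};
int ret;

cfg.direction = DMA_DEV_TO_MEM;  /* deprecated field; newer code passes the direction to the prep function */
cfg.src_addr = FIFO_RX_REG;      /* where the DMA slave data is read from */
cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
cfg.src_maxburst = 16;           /* words per burst, e.g. half the FIFO depth */

ret = dmaengine_slave_config(chan, &cfg);
if (ret)
    pr_err("slave config failed: %d\n", ret);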
Getting a descriptor for the transaction
When a DMA channel is obtained, the return value is an instance of struct dma_chan, which contains a struct dma_device *device field: this is the DMA controller that provides the channel. The controller's kernel driver provides a set of functions to prepare DMA transactions:
/**
 * struct dma_device - info on the entity supplying DMA services
 * @chancnt: how many DMA channels are supported
 * @privatecnt: how many DMA channels are requested by dma_request_channel
 * @channels: the list of struct dma_chan
 * @global_node: list_head for global dma_device_list
 * @filter: information for device/slave to filter function/param mapping
 * @cap_mask: one or more dma_capability flags
 * @desc_metadata_modes: supported metadata modes by the DMA device
 * @max_xor: maximum number of xor sources, 0 if no capability
 * @max_pq: maximum number of PQ sources and PQ-continue capability
 * @copy_align: alignment shift for memcpy operations
 * @xor_align: alignment shift for xor operations
 * @pq_align: alignment shift for pq operations
 * @fill_align: alignment shift for memset operations
 * @dev_id: unique device ID
 * @dev: struct device reference for dma mapping api
 * @owner: owner module (automatically set based on the provided dev)
 * @src_addr_widths: bit mask of src addr widths the device supports
 * Width is specified in bytes, e.g. for a device supporting
 * a width of 4 the mask should have BIT(4) set.
 * @dst_addr_widths: bit mask of dst addr widths the device supports
 * @directions: bit mask of slave directions the device supports.
 * Since the enum dma_transfer_direction is not defined as bit flag for
 * each type, the dma controller should set BIT(<TYPE>) and same
 * should be checked by controller as well
 * @min_burst: min burst capability per-transfer
 * @max_burst: max burst capability per-transfer
 * @max_sg_burst: max number of SG list entries executed in a single burst
 * DMA transaction with no software intervention for reinitialization.
 * Zero value means unlimited number of entries.
 * @residue_granularity: granularity of the transfer residue reported
 * by tx_status
 * @device_alloc_chan_resources: allocate resources and return the
 * number of allocated descriptors
 * @device_router_config: optional callback for DMA router configuration
 * @device_free_chan_resources: release DMA channel's resources
 * @device_prep_dma_memcpy: prepares a memcpy operation
 * @device_prep_dma_xor: prepares a xor operation
 * @device_prep_dma_xor_val: prepares a xor validation operation
 * @device_prep_dma_pq: prepares a pq operation
 * @device_prep_dma_pq_val: prepares a pqzero_sum operation
 * @device_prep_dma_memset: prepares a memset operation
 * @device_prep_dma_memset_sg: prepares a memset operation over a scatter list
 * @device_prep_dma_interrupt: prepares an end of chain interrupt operation
 * @device_prep_slave_sg: prepares a slave dma operation
 * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
 * The function takes a buffer of size buf_len. The callback function will
 * be called after period_len bytes have been transferred.
 * @device_prep_interleaved_dma: Transfer expression in a generic way.
 * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
 * @device_caps: May be used to override the generic DMA slave capabilities
 * with per-channel specific ones
 * @device_config: Pushes a new configuration to a channel, return 0 or an error
 * code
 * @device_pause: Pauses any transfer happening on a channel. Returns
 * 0 or an error code
 * @device_resume: Resumes any transfer on a channel previously
 * paused. Returns 0 or an error code
 * @device_terminate_all: Aborts all transfers on a channel. Returns 0
 * or an error code
 * @device_synchronize: Synchronizes the termination of a transfers to the
 * current context.
 * @device_tx_status: poll for transaction completion, the optional
 * txstate parameter can be supplied with a pointer to get a
 * struct with auxiliary transfer status information, otherwise the call
 * will just return a simple status code
 * @device_issue_pending: push pending transactions to hardware
 * @descriptor_reuse: a submitted transfer can be resubmitted after completion
 * @device_release: called sometime after dma_async_device_unregister() is
 * called and there are no further references to this structure. This
 * must be implemented to free resources however many existing drivers
 * do not and are therefore not safe to unbind while in use.
 * @dbg_summary_show: optional routine to show contents in debugfs; default code
 * will be used when this is omitted, but custom code can show extra,
 * controller specific information.
 */
struct dma_device {
    struct kref ref;
    unsigned int chancnt;
    unsigned int privatecnt;
    struct list_head channels;
    struct list_head global_node;
    struct dma_filter filter;
    dma_cap_mask_t cap_mask;
    enum dma_desc_metadata_mode desc_metadata_modes;
    unsigned short max_xor;
    unsigned short max_pq;
    enum dmaengine_alignment copy_align;
    enum dmaengine_alignment xor_align;
    enum dmaengine_alignment pq_align;
    enum dmaengine_alignment fill_align;
#define DMA_HAS_PQ_CONTINUE (1 << 15)
    int dev_id;
    struct device *dev;
    struct module *owner;
    struct ida chan_ida;
    struct mutex chan_mutex;    /* to protect chan_ida */
    u32 src_addr_widths;
    u32 dst_addr_widths;
    u32 directions;
    u32 min_burst;
    u32 max_burst;
    u32 max_sg_burst;
    bool descriptor_reuse;
    enum dma_residue_granularity residue_granularity;

    int (*device_alloc_chan_resources)(struct dma_chan *chan);
    int (*device_router_config)(struct dma_chan *chan);
    void (*device_free_chan_resources)(struct dma_chan *chan);

    struct dma_async_tx_descriptor *(*device_prep_dma_memcpy)(
        struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
        size_t len, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_xor)(
        struct dma_chan *chan, dma_addr_t dst, dma_addr_t *src,
        unsigned int src_cnt, size_t len, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_xor_val)(
        struct dma_chan *chan, dma_addr_t *src, unsigned int src_cnt,
        size_t len, enum sum_check_flags *result, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_pq)(
        struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
        unsigned int src_cnt, const unsigned char *scf,
        size_t len, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_pq_val)(
        struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
        unsigned int src_cnt, const unsigned char *scf, size_t len,
        enum sum_check_flags *pqres, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_memset)(
        struct dma_chan *chan, dma_addr_t dest, int value, size_t len,
        unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_memset_sg)(
        struct dma_chan *chan, struct scatterlist *sg,
        unsigned int nents, int value, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
        struct dma_chan *chan, unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_slave_sg)(
        struct dma_chan *chan, struct scatterlist *sgl,
        unsigned int sg_len, enum dma_transfer_direction direction,
        unsigned long flags, void *context);
    struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
        struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
        size_t period_len, enum dma_transfer_direction direction,
        unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
        struct dma_chan *chan, struct dma_interleaved_template *xt,
        unsigned long flags);
    struct dma_async_tx_descriptor *(*device_prep_dma_imm_data)(
        struct dma_chan *chan, dma_addr_t dst, u64 data,
        unsigned long flags);

    void (*device_caps)(struct dma_chan *chan, struct dma_slave_caps *caps);
    int (*device_config)(struct dma_chan *chan, struct dma_slave_config *config);
    int (*device_pause)(struct dma_chan *chan);
    int (*device_resume)(struct dma_chan *chan);
    int (*device_terminate_all)(struct dma_chan *chan);
    void (*device_synchronize)(struct dma_chan *chan);
    enum dma_status (*device_tx_status)(struct dma_chan *chan,
                                        dma_cookie_t cookie,
                                        struct dma_tx_state *txstate);
    void (*device_issue_pending)(struct dma_chan *chan);
    void (*device_release)(struct dma_device *dev);
    /* debugfs support */
    void (*dbg_summary_show)(struct seq_file *s, struct dma_device *dev);
    struct dentry *dbg_dev_root;
};
For example, device_prep_dma_memcpy() prepares a memcpy operation. Each of these preparation functions returns a pointer to a struct dma_async_tx_descriptor, which represents the transaction we want to perform:
/**
 * struct dma_async_tx_descriptor - async transaction descriptor
 * ---dma generic offload fields---
 * @cookie: tracking cookie for this transaction, set to -EBUSY if
 * this tx is sitting on a dependency list
 * @flags: flags to augment operation preparation, control completion, and
 * communicate status
 * @phys: physical address of the descriptor
 * @chan: target channel for this operation
 * @tx_submit: accept the descriptor, assign ordered cookie and mark the
 * descriptor pending. To be pushed on .issue_pending() call
 * @callback: routine to call after this operation is complete
 * @callback_param: general parameter to pass to the callback routine
 * @desc_metadata_mode: core managed metadata mode to protect mixed use of
 * DESC_METADATA_CLIENT or DESC_METADATA_ENGINE. Otherwise
 * DESC_METADATA_NONE
 * @metadata_ops: DMA driver provided metadata mode ops, need to be set by the
 * DMA driver if metadata mode is supported with the descriptor
 * ---async_tx api specific fields---
 * @next: at completion submit this descriptor
 * @parent: pointer to the next level up in the dependency chain
 * @lock: protect the parent and next pointers
 */
struct dma_async_tx_descriptor {
    dma_cookie_t cookie;
    enum dma_ctrl_flags flags;    /* not a 'long' to pack with cookie */
    dma_addr_t phys;
    struct dma_chan *chan;
    dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
    int (*desc_free)(struct dma_async_tx_descriptor *tx);
    dma_async_tx_callback callback;
    dma_async_tx_callback_result callback_result;
    void *callback_param;
    struct dmaengine_unmap_data *unmap;
    enum dma_desc_metadata_mode desc_metadata_mode;
    struct dma_descriptor_metadata_ops *metadata_ops;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
    struct dma_async_tx_descriptor *next;
    struct dma_async_tx_descriptor *parent;
    spinlock_t lock;
#endif
};
The callback field is the routine invoked after the operation completes; it is assigned a function name of the following type:
typedef void (*dma_async_tx_callback)(void *dma_async_param);
For example, once the transaction prepared by device_prep_dma_memcpy() completes, the callback is invoked.
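A minimal sketch of preparing a memcpy transaction and attaching a callback, assuming chan, dst, src and len were set up in the previous steps; the callback name is illustrative:

static void xfer_done_callback(void *param)
{
    pr_info("DMA memcpy finished\n");
}

/* ... in the transfer setup path ... */
struct dma_async_tx_descriptor *desc;

desc = chan->device->device_prep_dma_memcpy(chan, dst, src, len, 0);
if (!desc)
    return -EINVAL;

desc->callback = xfer_done_callback;
desc->callback_param = NULL;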
Submitting the transaction
To put the transaction into the driver's pending queue, call dmaengine_submit(). Once the descriptor has been prepared and the callback information added, it should be placed on the DMA engine driver's pending queue:
dma_cookie_t dmaengine_submit(struct dma_async_tx_descriptor *desc);
This function returns a cookie that can be used to check the progress of the DMA activity through other DMA engine calls. dmaengine_submit() does not start the DMA operation; it merely adds it to the pending queue.
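A minimal sketch of validating the returned cookie with dma_submit_error(), continuing from the previous snippet:

dma_cookie_t cookie;

cookie = dmaengine_submit(desc);
if (dma_submit_error(cookie))
    return -EINVAL;    /* the descriptor was rejected */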
Issuing pending DMA requests and waiting for callback notification
Starting the transaction is the last step of setting up a DMA transfer: call dma_async_issue_pending() on the channel to activate the transactions in the channel's pending queue. If the channel is idle, the first transaction in the queue starts and subsequent ones queue up behind it. When a DMA operation completes, the next one in the queue starts and a software interrupt (tasklet) is triggered. If set up, this tasklet is responsible for calling the client driver's completion callback routine for notification:
void dma_async_issue_pending(struct dma_chan *chan);
4 Programs
Single buffer mapping
/*
 * Copyright 2006-2014 Freescale Semiconductor, Inc. All rights reserved.
 */

/*
 * The code contained herein is licensed under the GNU General Public
 * License. You may obtain a copy of the GNU General Public License
 * Version 2 or later at the following locations:
 *
 * http://www.opensource.org/licenses/gpl-license.html
 * http://www.gnu.org/copyleft/gpl.html
 */

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/version.h>
#include <linux/platform_data/dma-imx.h>
#include <linux/dmaengine.h>
#include <linux/device.h>
#include <linux/io.h>
#include <linux/delay.h>

static int gMajor; /* major number of device */
static struct class *dma_tm_class;
u32 *wbuf;
u32 *rbuf;

struct dma_chan *dma_m2m_chan;
struct completion dma_m2m_ok;

/*
 * For single mapping, the buffer size does not need to be a multiple of
 * page size.
 */
#define SDMA_BUF_SIZE 1024

static bool dma_m2m_filter(struct dma_chan *chan, void *param)
{
    if (!imx_dma_is_general_purpose(chan))
        return false;
    chan->private = param;
    return true;
}

int sdma_open(struct inode *inode, struct file *filp)
{
    dma_cap_mask_t dma_m2m_mask;
    struct imx_dma_data m2m_dma_data = {0};

    init_completion(&dma_m2m_ok);

    /* Initialize capabilities */
    dma_cap_zero(dma_m2m_mask);
    dma_cap_set(DMA_MEMCPY, dma_m2m_mask);
    m2m_dma_data.peripheral_type = IMX_DMATYPE_MEMORY;
    m2m_dma_data.priority = DMA_PRIO_HIGH;

    /* 1- Allocate a DMA slave channel from a DMA controller */
    dma_m2m_chan = dma_request_channel(dma_m2m_mask, dma_m2m_filter, &m2m_dma_data);
    if (!dma_m2m_chan) {
        pr_err("Error opening the SDMA memory to memory channel\n");
        return -EINVAL;
    } else {
        pr_info("opened channel %d, req lin %d\n", dma_m2m_chan->chan_id, m2m_dma_data.dma_request);
    }

    /* Allocate the buffers */
    wbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!wbuf) {
        pr_err("error wbuf !!!!!!!!!!!\n");
        return -1;
    }
    rbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!rbuf) {
        pr_err("error rbuf !!!!!!!!!!!\n");
        return -1;
    }
    return 0;
}

int sdma_release(struct inode *inode, struct file *filp)
{
    dma_release_channel(dma_m2m_chan);
    dma_m2m_chan = NULL;
    kfree(wbuf);
    kfree(rbuf);
    return 0;
}

ssize_t sdma_read(struct file *filp, char __user *buf, size_t count, loff_t *offset)
{
    int i;

#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,0,35))
    for (i = 0; i < SDMA_BUF_SIZE/4; i++) {
        if (*(rbuf + i) != *(wbuf + i)) {
            pr_err("Single DMA buffer copy failed!,r=%x,w=%x,%d\n", *(rbuf+i), *(wbuf+i), i);
            return 0;
        }
    }
    pr_info("buffer copy passed!\n");
#endif
    return 0;
}

static void dma_m2m_callback(void *data)
{
    pr_info("in %s\n", __func__);
    complete(&dma_m2m_ok);
}

ssize_t sdma_write(struct file *filp, const char __user *buf, size_t count, loff_t *offset)
{
    u32 *index, i;
    struct dma_slave_config dma_m2m_config = {0};
    struct dma_async_tx_descriptor *dma_m2m_desc;
    dma_addr_t dma_src, dma_dst;
    dma_cookie_t cookie;

    index = wbuf;
    for (i = 0; i < SDMA_BUF_SIZE/4; i++)
        *(index + i) = 0x56565656;

    /* 2- Set slave and controller specific parameters */
    dma_m2m_config.direction = DMA_MEM_TO_MEM;                   /* DMA direction */
    dma_m2m_config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;  /* bus width */
    dmaengine_slave_config(dma_m2m_chan, &dma_m2m_config);

    /*
     * DMA mapping: generate bus addresses for the buffers. Note that
     * dma_map_single() takes an enum dma_data_direction (DMA_TO_DEVICE /
     * DMA_FROM_DEVICE), not the dma_transfer_direction stored in
     * dma_m2m_config.direction.
     */
    dma_src = dma_map_single(NULL, wbuf, SDMA_BUF_SIZE, DMA_TO_DEVICE);
    dma_dst = dma_map_single(NULL, rbuf, SDMA_BUF_SIZE, DMA_FROM_DEVICE);

    /* 3- Get a descriptor for the transaction */
    dma_m2m_desc = dma_m2m_chan->device->device_prep_dma_memcpy(dma_m2m_chan,
            dma_dst, dma_src, SDMA_BUF_SIZE, 0);
    if (!dma_m2m_desc) {
        pr_err("error in prep_dma_memcpy\n");
        return -EINVAL;
    }
    pr_info("Got a DMA descriptor\n");
    dma_m2m_desc->callback = dma_m2m_callback;

    /* 4- Submit the transaction */
    cookie = dmaengine_submit(dma_m2m_desc);
    pr_info("Got this cookie: %d\n", cookie);

    /* 5- Issue pending DMA requests and wait for callback notification */
    dma_async_issue_pending(dma_m2m_chan);
    pr_info("waiting for DMA transaction...\n");

    /* One can use wait_for_completion_timeout() also */
    wait_for_completion(&dma_m2m_ok);

    dma_unmap_single(NULL, dma_src, SDMA_BUF_SIZE, DMA_TO_DEVICE);
    dma_unmap_single(NULL, dma_dst, SDMA_BUF_SIZE, DMA_FROM_DEVICE);

    return count;
}

struct file_operations dma_fops = {
    .open = sdma_open,
    .release = sdma_release,
    .read = sdma_read,
    .write = sdma_write,
};

int __init sdma_init_module(void)
{
    struct device *temp_class;
    int error;

    /* register a character device */
    error = register_chrdev(0, "sdma_test", &dma_fops);
    if (error < 0) {
        pr_err("SDMA test driver can't get major number\n");
        return error;
    }
    gMajor = error;
    pr_info("SDMA test major number = %d\n", gMajor);

    dma_tm_class = class_create(THIS_MODULE, "sdma_test");
    if (IS_ERR(dma_tm_class)) {
        pr_err("Error creating sdma test module class.\n");
        unregister_chrdev(gMajor, "sdma_test");
        return PTR_ERR(dma_tm_class);
    }

    temp_class = device_create(dma_tm_class, NULL, MKDEV(gMajor, 0), NULL, "sdma_test");
    if (IS_ERR(temp_class)) {
        pr_err("Error creating sdma test class device.\n");
        class_destroy(dma_tm_class);
        unregister_chrdev(gMajor, "sdma_test");
        return -1;
    }

    pr_info("SDMA test Driver Module loaded\n");
    return 0;
}

static void sdma_cleanup_module(void)
{
    unregister_chrdev(gMajor, "sdma_test");
    device_destroy(dma_tm_class, MKDEV(gMajor, 0));
    class_destroy(dma_tm_class);
    pr_info("SDMA test Driver Module Unloaded\n");
}

module_init(sdma_init_module);
module_exit(sdma_cleanup_module);

MODULE_AUTHOR("Freescale Semiconductor");
MODULE_AUTHOR("John Madieu <john.madieu@gmail.com>");
MODULE_DESCRIPTION("SDMA test driver");
MODULE_LICENSE("GPL");
- When the application calls open(), the driver allocates a DMA slave channel and two buffers, wbuf and rbuf; wbuf holds the data to transfer, which gets copied into rbuf.
- When the application calls write(), the following steps run in order:
- Call dmaengine_slave_config() to set the slave and controller specific parameters
- Get the transaction descriptor dma_m2m_desc and attach the callback; dma_m2m_desc is returned by a function from the device field of the channel (dma_m2m_chan)
- Call dmaengine_submit() to submit the transaction
- Call dma_async_issue_pending() to activate the transaction
- Call wait_for_completion() to wait for the transaction to finish; on completion, the callback attached to dma_m2m_desc runs automatically and calls complete(), letting the program continue.
- When the application calls read(), the driver checks whether the data was transferred correctly, as the sketch below shows.
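For completeness, here is a minimal user-space sketch (hypothetical usage, error handling trimmed) that exercises the driver: write() triggers the memory-to-memory DMA copy and read() makes the driver verify rbuf against wbuf; /dev/sdma_test matches the device node created by the module:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char dummy = 0;
    int fd = open("/dev/sdma_test", O_RDWR);

    if (fd < 0)
        return 1;
    write(fd, &dummy, 1);    /* runs the DMA copy in sdma_write() */
    read(fd, &dummy, 1);     /* sdma_read() compares rbuf with wbuf */
    close(fd);
    return 0;
}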
Scatter/gather mapping
/*
 * Copyright 2006-2014 Freescale Semiconductor, Inc. All rights reserved.
 */

/*
 * The code contained herein is licensed under the GNU General Public
 * License. You may obtain a copy of the GNU General Public License
 * Version 2 or later at the following locations:
 *
 * http://www.opensource.org/licenses/gpl-license.html
 * http://www.gnu.org/copyleft/gpl.html
 */

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/version.h>
#include <linux/platform_data/dma-imx.h>
#include <asm/mach/dma.h>
#include <linux/dmaengine.h>
#include <linux/device.h>
#include <linux/io.h>
#include <linux/delay.h>

static int gMajor; /* major number of device */
static struct class *dma_tm_class;
u32 *wbuf, *wbuf2, *wbuf3;
u32 *rbuf, *rbuf2, *rbuf3;

struct dma_chan *dma_m2m_chan;
struct completion dma_m2m_ok;
struct scatterlist sg[3], sg2[3];

/*
 * There is an errata here in the book.
 * This should be 1024*16 instead of 1024.
 */
#define SDMA_BUF_SIZE (1024 * 16)

static bool dma_m2m_filter(struct dma_chan *chan, void *param)
{
    if (!imx_dma_is_general_purpose(chan))
        return false;
    chan->private = param;
    return true;
}

int sdma_open(struct inode *inode, struct file *filp)
{
    dma_cap_mask_t dma_m2m_mask;
    struct imx_dma_data m2m_dma_data = {0};

    init_completion(&dma_m2m_ok);

    /* Initialize capabilities */
    dma_cap_zero(dma_m2m_mask);
    dma_cap_set(DMA_MEMCPY, dma_m2m_mask);
    m2m_dma_data.peripheral_type = IMX_DMATYPE_MEMORY;
    m2m_dma_data.priority = DMA_PRIO_HIGH;

    /* 1- Allocate a DMA slave channel. */
    dma_m2m_chan = dma_request_channel(dma_m2m_mask, dma_m2m_filter, &m2m_dma_data);
    if (!dma_m2m_chan) {
        pr_err("Error opening the SDMA memory to memory channel\n");
        return -EINVAL;
    } else {
        pr_info("opened channel %d, req lin %d\n", dma_m2m_chan->chan_id, m2m_dma_data.dma_request);
    }

    wbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!wbuf) {
        pr_err("error wbuf !!!!!!!!!!!\n");
        return -1;
    }
    wbuf2 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!wbuf2) {
        pr_err("error wbuf2 !!!!!!!!!!!\n");
        return -1;
    }
    wbuf3 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!wbuf3) {
        pr_err("error wbuf3 !!!!!!!!!!!\n");
        return -1;
    }
    rbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!rbuf) {
        pr_err("error rbuf !!!!!!!!!!!\n");
        return -1;
    }
    rbuf2 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!rbuf2) {
        pr_err("error rbuf2 !!!!!!!!!!!\n");
        return -1;
    }
    rbuf3 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if (!rbuf3) {
        pr_err("error rbuf3 !!!!!!!!!!!\n");
        return -1;
    }
    return 0;
}

int sdma_release(struct inode *inode, struct file *filp)
{
    dma_release_channel(dma_m2m_chan);
    dma_m2m_chan = NULL;
    kfree(wbuf);
    kfree(wbuf2);
    kfree(wbuf3);
    kfree(rbuf);
    kfree(rbuf2);
    kfree(rbuf3);
    return 0;
}

ssize_t sdma_read(struct file *filp, char __user *buf, size_t count, loff_t *offset)
{
    int i;

    for (i = 0; i < SDMA_BUF_SIZE/4; i++) {
        if (*(rbuf + i) != *(wbuf + i)) {
            pr_err("buffer 1 copy failed!\n");
            return 0;
        }
    }
    pr_info("buffer 1 copy passed!\n");

    for (i = 0; i < SDMA_BUF_SIZE/2/4; i++) {
        if (*(rbuf2 + i) != *(wbuf2 + i)) {
            pr_err("buffer 2 copy failed!\n");
            return 0;
        }
    }
    pr_info("buffer 2 copy passed!\n");

    for (i = 0; i < SDMA_BUF_SIZE/4; i++) {
        if (*(rbuf3 + i) != *(wbuf3 + i)) {
            pr_err("buffer 3 copy failed!\n");
            return 0;
        }
    }
    pr_info("buffer 3 copy passed!\n");
    return 0;
}

static void dma_m2m_callback(void *data)
{
    pr_info("in %s\n", __func__);
    complete(&dma_m2m_ok);
}

ssize_t sdma_write(struct file *filp, const char __user *buf, size_t count, loff_t *offset)
{
    u32 *index1, *index2, *index3, i;
    int ret;
    struct dma_slave_config dma_m2m_config = {0};
    struct dma_async_tx_descriptor *dma_m2m_desc;
    dma_cookie_t cookie;
    struct timeval end_time;
    unsigned long end, start;

    index1 = wbuf;
    index2 = wbuf2;
    index3 = wbuf3;
    for (i = 0; i < SDMA_BUF_SIZE/4; i++)
        *(index1 + i) = 0x12121212;
    for (i = 0; i < SDMA_BUF_SIZE/4; i++)
        *(index2 + i) = 0x34343434;
    for (i = 0; i < SDMA_BUF_SIZE/4; i++)
        *(index3 + i) = 0x56565656;

#if 0
    for (i = 0; i < SDMA_BUF_SIZE/4; i++)
        pr_info("input data_%d : %x\n", i, *(wbuf + i));
    for (i = 0; i < SDMA_BUF_SIZE/2/4; i++)
        pr_info("input data2_%d : %x\n", i, *(wbuf2 + i));
    for (i = 0; i < SDMA_BUF_SIZE/4; i++)
        pr_info("input data3_%d : %x\n", i, *(wbuf3 + i));
#endif

    /* 2- Set slave and controller specific parameters */
    dma_m2m_config.direction = DMA_MEM_TO_MEM;
    dma_m2m_config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
    dmaengine_slave_config(dma_m2m_chan, &dma_m2m_config);

    /*
     * dma_map_sg() actually expects an enum dma_data_direction; passing
     * dma_m2m_config.direction works here only because DMA_MEM_TO_MEM and
     * DMA_BIDIRECTIONAL are both 0.
     */
    sg_init_table(sg, 3);
    sg_set_buf(&sg[0], wbuf, SDMA_BUF_SIZE);
    sg_set_buf(&sg[1], wbuf2, SDMA_BUF_SIZE);
    sg_set_buf(&sg[2], wbuf3, SDMA_BUF_SIZE);
    ret = dma_map_sg(NULL, sg, 3, dma_m2m_config.direction);

    sg_init_table(sg2, 3);
    sg_set_buf(&sg2[0], rbuf, SDMA_BUF_SIZE);
    sg_set_buf(&sg2[1], rbuf2, SDMA_BUF_SIZE);
    sg_set_buf(&sg2[2], rbuf3, SDMA_BUF_SIZE);
    ret = dma_map_sg(NULL, sg2, 3, dma_m2m_config.direction);

    /* 3- Get a descriptor for the transaction. */
    dma_m2m_desc = dma_m2m_chan->device->device_prep_dma_sg(dma_m2m_chan,
            sg2, 3, sg, 3, DMA_MEM_TO_MEM);
    if (!dma_m2m_desc) {
        pr_err("error in prep_dma_sg\n");
        return -EINVAL;
    }
    pr_info("Got a DMA descriptor\n");
    dma_m2m_desc->callback = dma_m2m_callback;

    do_gettimeofday(&end_time);
    start = end_time.tv_sec * 1000000 + end_time.tv_usec;

    /* 4- Submit the transaction */
    cookie = dmaengine_submit(dma_m2m_desc);
    pr_info("Got this cookie: %d\n", cookie);

    /* 5- Issue pending DMA requests and wait for callback notification */
    dma_async_issue_pending(dma_m2m_chan);
    pr_info("waiting for DMA transaction...\n");

    /* One can use wait_for_completion_timeout() also */
    wait_for_completion(&dma_m2m_ok);

    do_gettimeofday(&end_time);
    end = end_time.tv_sec * 1000000 + end_time.tv_usec;
    pr_info("end - start = %lu\n", end - start);

    /* Once the transaction is done, unmap the scatterlists */
    dma_unmap_sg(NULL, sg, 3, dma_m2m_config.direction);
    dma_unmap_sg(NULL, sg2, 3, dma_m2m_config.direction);

    return count;
}

struct file_operations dma_fops = {
    .open = sdma_open,
    .release = sdma_release,
    .read = sdma_read,
    .write = sdma_write,
};

int __init sdma_init_module(void)
{
    struct device *temp_class;
    int error;

    /* register a character device */
    error = register_chrdev(0, "sdma_test", &dma_fops);
    if (error < 0) {
        pr_err("SDMA test driver can't get major number\n");
        return error;
    }
    gMajor = error;
    pr_info("SDMA test major number = %d\n", gMajor);

    dma_tm_class = class_create(THIS_MODULE, "sdma_test");
    if (IS_ERR(dma_tm_class)) {
        pr_err("Error creating sdma test module class.\n");
        unregister_chrdev(gMajor, "sdma_test");
        return PTR_ERR(dma_tm_class);
    }

    temp_class = device_create(dma_tm_class, NULL, MKDEV(gMajor, 0), NULL, "sdma_test");
    if (IS_ERR(temp_class)) {
        pr_err("Error creating sdma test class device.\n");
        class_destroy(dma_tm_class);
        unregister_chrdev(gMajor, "sdma_test");
        return -1;
    }

    pr_info("SDMA test Driver Module loaded\n");
    return 0;
}

static void sdma_cleanup_module(void)
{
    unregister_chrdev(gMajor, "sdma_test");
    device_destroy(dma_tm_class, MKDEV(gMajor, 0));
    class_destroy(dma_tm_class);
    pr_info("SDMA test Driver Module Unloaded\n");
}

module_init(sdma_init_module);
module_exit(sdma_cleanup_module);

MODULE_AUTHOR("Freescale Semiconductor");
MODULE_AUTHOR("John Madieu <john.madieu@gmail.com>");
MODULE_DESCRIPTION("SDMA test driver");
MODULE_LICENSE("GPL");