Reference Links
AVFrame
- AVFrame is a structure that contains a large number of bitstream parameters.
- The structure is defined in frame.h (libavutil).
- AVFrame is generally used to store raw (uncompressed) data: YUV or RGB for video, PCM for audio. It also carries related metadata: when decoding it holds the macroblock type table, the QP table, the motion vector table, and so on; when encoding it stores similar data. AVFrame is therefore a key structure when analyzing bitstreams with FFmpeg.
- Key fields
- uint8_t *data[AV_NUM_DATA_POINTERS]: decoded raw data (YUV/RGB for video, PCM for audio)
- int linesize[AV_NUM_DATA_POINTERS]: size in bytes of one "line" of data in data[]. Note: it is not necessarily equal to the image width; it is usually larger.
- int width, height: width and height of the video frame (1920x1080, 1280x720, ...)
- int nb_samples: number of audio samples (per channel) in this AVFrame; one audio AVFrame may cover several audio frames, and this field records how many samples it holds
- int format: format of the decoded raw data (YUV420P, YUV422P, RGB24, ...)
- int key_frame: whether the frame is a keyframe
- enum AVPictureType pict_type: picture type (I, B, P, ...)
- AVRational sample_aspect_ratio: sample aspect ratio (16:9, 4:3, ...)
- int64_t pts: presentation timestamp
- int coded_picture_number: picture number in coding (bitstream) order
- int display_picture_number: picture number in display order
- int interlaced_frame: whether the frame is interlaced
Field Details
data[]
- For packed formats (e.g. RGB24), all of the data is stored in data[0].
- For planar formats (e.g. YUV420P), the planes are split across data[0], data[1], data[2], ... (for YUV420P, data[0] holds Y, data[1] holds U, data[2] holds V)
- Reference: "FFMPEG 實現 YUV,RGB各種圖像原始數據之間的轉換(swscale)" (Lei Xiaohua's CSDN blog)
pict_type
- It can take the following values:
- Defined in <avutil.h> (libavutil):
/**
 * @defgroup lavu_picture Image related
 *
 * AVPicture types, pixel formats and basic image planes manipulation.
 *
 * @{
 */
enum AVPictureType {
    AV_PICTURE_TYPE_NONE = 0, ///< Undefined
    AV_PICTURE_TYPE_I,        ///< Intra
    AV_PICTURE_TYPE_P,        ///< Predicted
    AV_PICTURE_TYPE_B,        ///< Bi-dir predicted
    AV_PICTURE_TYPE_S,        ///< S(GMC)-VOP MPEG-4
    AV_PICTURE_TYPE_SI,       ///< Switching Intra
    AV_PICTURE_TYPE_SP,       ///< Switching Predicted
    AV_PICTURE_TYPE_BI,       ///< BI type
};
sample_aspect_ratio
- The aspect ratio is a fraction; FFmpeg uses AVRational to represent fractions:
/**
 * @defgroup lavu_math_rational AVRational
 * @ingroup lavu_math
 * Rational number calculation.
 *
 * While rational numbers can be expressed as floating-point numbers, the
 * conversion process is a lossy one, so are floating-point operations. On the
 * other hand, the nature of FFmpeg demands highly accurate calculation of
 * timestamps. This set of rational number utilities serves as a generic
 * interface for manipulating rational numbers as pairs of numerators and
 * denominators.
 *
 * Many of the functions that operate on AVRational's have the suffix `_q`, in
 * reference to the mathematical symbol "ℚ" (Q) which denotes the set of all
 * rational numbers.
 *
 * @{
 */

/**
 * Rational number (pair of numerator and denominator).
 */
typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;
qscale_table
- The QP table points to a buffer that stores the QP value of every macroblock. Macroblocks are numbered from left to right, row by row, and each macroblock has one QP value.
- qscale_table[0] is the QP of the macroblock in row 1, column 1; qscale_table[1] is row 1, column 2; qscale_table[2] is row 1, column 3; and so on.
- The number of macroblocks is computed as follows:
- Note: a macroblock is 16x16 pixels. (The corresponding code was not located.)
- Macroblocks per row: int mb_stride = pCodecCtx->width/16+1
- Total number of macroblocks: int mb_sum = ((pCodecCtx->height+15)>>4)*(pCodecCtx->width/16+1)
/**
 * Picture.
 */
typedef struct Picture {
    struct AVFrame *f;
    ThreadFrame tf;
    AVBufferRef *qscale_table_buf;
    int8_t *qscale_table;
    AVBufferRef *motion_val_buf[2];
    int16_t (*motion_val[2])[2];
    AVBufferRef *mb_type_buf;
    uint32_t *mb_type;             ///< types and macros are defined in mpegutils.h
    AVBufferRef *mbskip_table_buf;
    uint8_t *mbskip_table;
    AVBufferRef *ref_index_buf[2];
    int8_t *ref_index[2];
    AVBufferRef *mb_var_buf;
    uint16_t *mb_var;              ///< Table for MB variances
    AVBufferRef *mc_mb_var_buf;
    uint16_t *mc_mb_var;           ///< Table for motion compensated MB variances
    int alloc_mb_width;            ///< mb_width used to allocate tables
    int alloc_mb_height;           ///< mb_height used to allocate tables
    int alloc_mb_stride;           ///< mb_stride used to allocate tables
    AVBufferRef *mb_mean_buf;
    uint8_t *mb_mean;              ///< Table for MB luminance
    AVBufferRef *hwaccel_priv_buf;
    void *hwaccel_picture_private; ///< Hardware accelerator private data
    int field_picture;             ///< whether or not the picture was encoded in separate fields
    int64_t mb_var_sum;            ///< sum of MB variance for current frame
    int64_t mc_mb_var_sum;         ///< motion compensated MB variance for current frame
    int b_frame_score;
    int needs_realloc;             ///< Picture needs to be reallocated (eg due to a frame size change)
    int reference;
    int shared;
    uint64_t encoding_error[MPEGVIDEO_MAX_PLANES];
} Picture;
motion_val
- The motion vector table stores all of the motion vectors of one frame.
- The value is stored in a rather unusual form: int16_t (*motion_val[2])[2]
typedef struct ERPicture {
    AVFrame *f;
    ThreadFrame *tf;

    // it is the caller's responsibility to allocate these buffers
    int16_t (*motion_val[2])[2];
    int8_t *ref_index[2];

    uint32_t *mb_type;
    int field_picture;
} ERPicture;
int mv_sample_log2= 4 - motion_subsample_log2;
int mb_width= (width+15)>>4;
int mv_stride= (mb_width << mv_sample_log2) + 1;
motion_val[direction][x + y*mv_stride][0->mv_x, 1->mv_y];
- From this, the rough structure of the data is:
- 1. It is first split into two lists, L0 and L1.
- 2. Each list (L0 or L1) stores a series of MVs (each MV covers one picture area whose size is determined by motion_subsample_log2).
- 3. Each MV has a horizontal and a vertical component (x, y).
- Note that in FFmpeg, MVs and macroblocks have no direct link in storage: the first MV belongs to the top-left area of the picture (area size determined by motion_subsample_log2), the second MV to the area in row 1, column 2, and so on. The motion vectors of one 16x16 macroblock may therefore be laid out as shown below (line is the number of motion vectors per row):
// Example: relationship between motion vectors and a macroblock
// with 8x8 partitions:
// -------------------------
// |          |            |
// |mv[x]     |mv[x+1]     |
// -------------------------
// |          |            |
// |mv[x+line]|mv[x+line+1]|
// -------------------------
mb_type
- The macroblock type table stores the type of every macroblock in a frame. It is stored much like the QP table, except that the entries are uint32 rather than uint8. Each macroblock has one type value.
- The macroblock types are defined as follows:
/* MB types */
#define MB_TYPE_INTRA4x4 (1 << 0)
#define MB_TYPE_INTRA16x16 (1 << 1) // FIXME H.264-specific
#define MB_TYPE_INTRA_PCM (1 << 2) // FIXME H.264-specific
#define MB_TYPE_16x16 (1 << 3)
#define MB_TYPE_16x8 (1 << 4)
#define MB_TYPE_8x16 (1 << 5)
#define MB_TYPE_8x8 (1 << 6)
#define MB_TYPE_INTERLACED (1 << 7)
#define MB_TYPE_DIRECT2 (1 << 8) // FIXME
#define MB_TYPE_ACPRED (1 << 9)
#define MB_TYPE_GMC (1 << 10)
#define MB_TYPE_SKIP (1 << 11)
#define MB_TYPE_P0L0 (1 << 12)
#define MB_TYPE_P1L0 (1 << 13)
#define MB_TYPE_P0L1 (1 << 14)
#define MB_TYPE_P1L1 (1 << 15)
#define MB_TYPE_L0 (MB_TYPE_P0L0 | MB_TYPE_P1L0)
#define MB_TYPE_L1 (MB_TYPE_P0L1 | MB_TYPE_P1L1)
#define MB_TYPE_L0L1 (MB_TYPE_L0 | MB_TYPE_L1)
#define MB_TYPE_QUANT (1 << 16)
#define MB_TYPE_CBP (1 << 17)
#define MB_TYPE_INTRA MB_TYPE_INTRA4x4 // default mb_type if there is just one type
- If a macroblock has one or more of the types defined above, the corresponding bits of its type value are set to 1.
- Note: a macroblock can have several types at once, but some types are mutually exclusive; for example, a macroblock cannot be both 16x16 and 8x8.
ref_index
- The motion-estimation reference frame list stores the reference frame indices of every macroblock in a frame. This list was of little use in earlier compression standards; only standards such as H.264 have the concept of multiple reference frames.
- Each macroblock has four such values, each giving a reference frame index.
Code
typedef struct AVFrame {
#define AV_NUM_DATA_POINTERS 8
    /**
     * pointer to the picture/channel planes.
     * This might be different from the first allocated byte. For video,
     * it could even point to the end of the image data.
     *
     * All pointers in data and extended_data must point into one of the
     * AVBufferRef in buf or extended_buf.
     *
     * Some decoders access areas outside 0,0 - width,height, please
     * see avcodec_align_dimensions2(). Some filters and swscale can read
     * up to 16 bytes beyond the planes, if these filters are to be used,
     * then 16 extra bytes must be allocated.
     *
     * NOTE: Pointers not needed by the format MUST be set to NULL.
     *
     * @attention In case of video, the data[] pointers can point to the
     * end of image data in order to reverse line order, when used in
     * combination with negative values in the linesize[] array.
     */
    uint8_t *data[AV_NUM_DATA_POINTERS];

    /**
     * For video, a positive or negative value, which is typically indicating
     * the size in bytes of each picture line, but it can also be:
     * - the negative byte size of lines for vertical flipping
     *   (with data[n] pointing to the end of the data)
     * - a positive or negative multiple of the byte size as for accessing
     *   even and odd fields of a frame (possibly flipped)
     *
     * For audio, only linesize[0] may be set. For planar audio, each channel
     * plane must be the same size.
     *
     * For video the linesizes should be multiples of the CPUs alignment
     * preference, this is 16 or 32 for modern desktop CPUs.
     * Some code requires such alignment other code can be slower without
     * correct alignment, for yet other it makes no difference.
     *
     * @note The linesize may be larger than the size of usable data -- there
     * may be extra padding present for performance reasons.
     *
     * @attention In case of video, line size values can be negative to achieve
     * a vertically inverted iteration over image lines.
     */
    int linesize[AV_NUM_DATA_POINTERS];

    /**
     * pointers to the data planes/channels.
     *
     * For video, this should simply point to data[].
     *
     * For planar audio, each channel has a separate data pointer, and
     * linesize[0] contains the size of each channel buffer.
     * For packed audio, there is just one data pointer, and linesize[0]
     * contains the total size of the buffer for all channels.
     *
     * Note: Both data and extended_data should always be set in a valid frame,
     * but for planar audio with more channels that can fit in data,
     * extended_data must be used in order to access all channels.
     */
    uint8_t **extended_data;

    /**
     * @name Video dimensions
     * Video frames only. The coded dimensions (in pixels) of the video frame,
     * i.e. the size of the rectangle that contains some well-defined values.
     *
     * @note The part of the frame intended for display/presentation is further
     * restricted by the @ref cropping "Cropping rectangle".
     * @{
     */
    int width, height;
    /**
     * @}
     */

    /**
     * number of audio samples (per channel) described by this frame
     */
    int nb_samples;

    /**
     * format of the frame, -1 if unknown or unset
     * Values correspond to enum AVPixelFormat for video frames,
     * enum AVSampleFormat for audio)
     */
    int format;

    /**
     * 1 -> keyframe, 0-> not
     */
    int key_frame;

    /**
     * Picture type of the frame.
     */
    enum AVPictureType pict_type;

    /**
     * Sample aspect ratio for the video frame, 0/1 if unknown/unspecified.
     */
    AVRational sample_aspect_ratio;

    /**
     * Presentation timestamp in time_base units (time when frame should be shown to user).
     */
    int64_t pts;

    /**
     * DTS copied from the AVPacket that triggered returning this frame. (if frame threading isn't used)
     * This is also the Presentation time of this AVFrame calculated from
     * only AVPacket.dts values without pts values.
     */
    int64_t pkt_dts;

    /**
     * Time base for the timestamps in this frame.
     * In the future, this field may be set on frames output by decoders or
     * filters, but its value will be by default ignored on input to encoders
     * or filters.
     */
    AVRational time_base;

    /**
     * picture number in bitstream order
     */
    int coded_picture_number;
    /**
     * picture number in display order
     */
    int display_picture_number;

    /**
     * quality (between 1 (good) and FF_LAMBDA_MAX (bad))
     */
    int quality;

    /**
     * for some private data of the user
     */
    void *opaque;

    /**
     * When decoding, this signals how much the picture must be delayed.
     * extra_delay = repeat_pict / (2*fps)
     */
    int repeat_pict;

    /**
     * The content of the picture is interlaced.
     */
    int interlaced_frame;

    /**
     * If the content is interlaced, is top field displayed first.
     */
    int top_field_first;

    /**
     * Tell user application that palette has changed from previous frame.
     */
    int palette_has_changed;

    /**
     * reordered opaque 64 bits (generally an integer or a double precision float
     * PTS but can be anything).
     * The user sets AVCodecContext.reordered_opaque to represent the input at
     * that time,
     * the decoder reorders values as needed and sets AVFrame.reordered_opaque
     * to exactly one of the values provided by the user through AVCodecContext.reordered_opaque
     */
    int64_t reordered_opaque;

    /**
     * Sample rate of the audio data.
     */
    int sample_rate;

#if FF_API_OLD_CHANNEL_LAYOUT
    /**
     * Channel layout of the audio data.
     * @deprecated use ch_layout instead
     */
    attribute_deprecated
    uint64_t channel_layout;
#endif

    /**
     * AVBuffer references backing the data for this frame. All the pointers in
     * data and extended_data must point inside one of the buffers in buf or
     * extended_buf. This array must be filled contiguously -- if buf[i] is
     * non-NULL then buf[j] must also be non-NULL for all j < i.
     *
     * There may be at most one AVBuffer per data plane, so for video this array
     * always contains all the references. For planar audio with more than
     * AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in
     * this array. Then the extra AVBufferRef pointers are stored in the
     * extended_buf array.
     */
    AVBufferRef *buf[AV_NUM_DATA_POINTERS];

    /**
     * For planar audio which requires more than AV_NUM_DATA_POINTERS
     * AVBufferRef pointers, this array will hold all the references which
     * cannot fit into AVFrame.buf.
     *
     * Note that this is different from AVFrame.extended_data, which always
     * contains all the pointers. This array only contains the extra pointers,
     * which cannot fit into AVFrame.buf.
     *
     * This array is always allocated using av_malloc() by whoever constructs
     * the frame. It is freed in av_frame_unref().
     */
    AVBufferRef **extended_buf;
    /**
     * Number of elements in extended_buf.
     */
    int nb_extended_buf;

    AVFrameSideData **side_data;
    int nb_side_data;

/**
 * @defgroup lavu_frame_flags AV_FRAME_FLAGS
 * @ingroup lavu_frame
 * Flags describing additional frame properties.
 *
 * @{
 */

/**
 * The frame data may be corrupted, e.g. due to decoding errors.
 */
#define AV_FRAME_FLAG_CORRUPT (1 << 0)
/**
 * A flag to mark the frames which need to be decoded, but shouldn't be output.
 */
#define AV_FRAME_FLAG_DISCARD (1 << 2)
/**
 * @}
 */

    /**
     * Frame flags, a combination of @ref lavu_frame_flags
     */
    int flags;

    /**
     * MPEG vs JPEG YUV range.
     * - encoding: Set by user
     * - decoding: Set by libavcodec
     */
    enum AVColorRange color_range;

    enum AVColorPrimaries color_primaries;

    enum AVColorTransferCharacteristic color_trc;

    /**
     * YUV colorspace type.
     * - encoding: Set by user
     * - decoding: Set by libavcodec
     */
    enum AVColorSpace colorspace;

    enum AVChromaLocation chroma_location;

    /**
     * frame timestamp estimated using various heuristics, in stream time base
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    int64_t best_effort_timestamp;

    /**
     * reordered pos from the last AVPacket that has been input into the decoder
     * - encoding: unused
     * - decoding: Read by user.
     */
    int64_t pkt_pos;

    /**
     * duration of the corresponding packet, expressed in
     * AVStream->time_base units, 0 if unknown.
     * - encoding: unused
     * - decoding: Read by user.
     */
    int64_t pkt_duration;

    /**
     * metadata.
     * - encoding: Set by user.
     * - decoding: Set by libavcodec.
     */
    AVDictionary *metadata;

    /**
     * decode error flags of the frame, set to a combination of
     * FF_DECODE_ERROR_xxx flags if the decoder produced a frame, but there
     * were errors during the decoding.
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    int decode_error_flags;
#define FF_DECODE_ERROR_INVALID_BITSTREAM   1
#define FF_DECODE_ERROR_MISSING_REFERENCE   2
#define FF_DECODE_ERROR_CONCEALMENT_ACTIVE  4
#define FF_DECODE_ERROR_DECODE_SLICES       8

#if FF_API_OLD_CHANNEL_LAYOUT
    /**
     * number of audio channels, only used for audio.
     * - encoding: unused
     * - decoding: Read by user.
     * @deprecated use ch_layout instead
     */
    attribute_deprecated
    int channels;
#endif

    /**
     * size of the corresponding packet containing the compressed
     * frame.
     * It is set to a negative value if unknown.
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    int pkt_size;

    /**
     * For hwaccel-format frames, this should be a reference to the
     * AVHWFramesContext describing the frame.
     */
    AVBufferRef *hw_frames_ctx;

    /**
     * AVBufferRef for free use by the API user. FFmpeg will never check the
     * contents of the buffer ref. FFmpeg calls av_buffer_unref() on it when
     * the frame is unreferenced. av_frame_copy_props() calls create a new
     * reference with av_buffer_ref() for the target frame's opaque_ref field.
     *
     * This is unrelated to the opaque field, although it serves a similar
     * purpose.
     */
    AVBufferRef *opaque_ref;

    /**
     * @anchor cropping
     * @name Cropping
     * Video frames only. The number of pixels to discard from the the
     * top/bottom/left/right border of the frame to obtain the sub-rectangle of
     * the frame intended for presentation.
     * @{
     */
    size_t crop_top;
    size_t crop_bottom;
    size_t crop_left;
    size_t crop_right;
    /**
     * @}
     */

    /**
     * AVBufferRef for internal use by a single libav* library.
     * Must not be used to transfer data between libraries.
     * Has to be NULL when ownership of the frame leaves the respective library.
     *
     * Code outside the FFmpeg libs should never check or change the contents of the buffer ref.
     *
     * FFmpeg calls av_buffer_unref() on it when the frame is unreferenced.
     * av_frame_copy_props() calls create a new reference with av_buffer_ref()
     * for the target frame's private_ref field.
     */
    AVBufferRef *private_ref;

    /**
     * Channel layout of the audio data.
     */
    AVChannelLayout ch_layout;
} AVFrame;