Reference Links
AVFrame
- AVFrame is a structure that contains a large number of bitstream parameters.
- The structure is defined in frame.h (libavutil).
- AVFrame is generally used to store raw (uncompressed) data: YUV or RGB for video, PCM for audio. It also carries related metadata: when decoding it holds the macroblock type table, the QP table, the motion vector table, and so on; when encoding it stores similar data. AVFrame is therefore a key structure when analyzing bitstreams with FFmpeg.
- Key fields
- uint8_t *data[AV_NUM_DATA_POINTERS]: decoded raw data (YUV/RGB for video, PCM for audio)
- int linesize[AV_NUM_DATA_POINTERS]: size in bytes of one "line" of data in data[]. Note: it is not necessarily equal to the image width; it is usually larger.
- int width, height: width and height of the video frame (1920x1080, 1280x720, ...)
- int nb_samples: number of audio samples (per channel) in this AVFrame; one audio AVFrame may cover several audio frames, and this field records how many samples it holds
- int format: format of the decoded raw data (YUV420P, YUV422P, RGB24, ...)
- int key_frame: whether the frame is a keyframe
- enum AVPictureType pict_type: picture type (I, B, P, ...)
- AVRational sample_aspect_ratio: sample aspect ratio (16:9, 4:3, ...)
- int64_t pts: presentation timestamp
- int coded_picture_number: picture number in coding (bitstream) order
- int display_picture_number: picture number in display order
- int interlaced_frame: whether the frame is interlaced
Field Details
data[]
- For packed formats (e.g. RGB24), all of the data is stored in data[0].
- For planar formats (e.g. YUV420P), the planes are split across data[0], data[1], data[2], ... (for YUV420P, data[0] holds Y, data[1] holds U, data[2] holds V)
- Reference: "FFMPEG 實現 YUV,RGB各種圖像原始數據之間的轉換(swscale)" (Lei Xiaohua's CSDN blog)
pict_type
- It can take the following values:
- Defined in <avutil.h> (libavutil):
/**
 * @defgroup lavu_picture Image related
 *
 * AVPicture types, pixel formats and basic image planes manipulation.
 *
 * @{
 */
enum AVPictureType {
    AV_PICTURE_TYPE_NONE = 0, ///< Undefined
    AV_PICTURE_TYPE_I,        ///< Intra
    AV_PICTURE_TYPE_P,        ///< Predicted
    AV_PICTURE_TYPE_B,        ///< Bi-dir predicted
    AV_PICTURE_TYPE_S,        ///< S(GMC)-VOP MPEG-4
    AV_PICTURE_TYPE_SI,       ///< Switching Intra
    AV_PICTURE_TYPE_SP,       ///< Switching Predicted
    AV_PICTURE_TYPE_BI,       ///< BI type
};
sample_aspect_ratio
- The aspect ratio is a fraction; FFmpeg uses AVRational to represent fractions:
/**
 * @defgroup lavu_math_rational AVRational
 * @ingroup lavu_math
 * Rational number calculation.
 *
 * While rational numbers can be expressed as floating-point numbers, the
 * conversion process is a lossy one, so are floating-point operations. On the
 * other hand, the nature of FFmpeg demands highly accurate calculation of
 * timestamps. This set of rational number utilities serves as a generic
 * interface for manipulating rational numbers as pairs of numerators and
 * denominators.
 *
 * Many of the functions that operate on AVRational's have the suffix `_q`, in
 * reference to the mathematical symbol "ℚ" (Q) which denotes the set of all
 * rational numbers.
 *
 * @{
 */

/**
 * Rational number (pair of numerator and denominator).
 */
typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;
qscale_table
- The QP table points to a buffer that stores the QP value of every macroblock. Macroblocks are numbered from left to right, row by row, and each macroblock has one QP value.
- qscale_table[0] is the QP of the macroblock in row 1, column 1; qscale_table[1] is row 1, column 2; qscale_table[2] is row 1, column 3; and so on.
- The number of macroblocks is computed as follows:
- Note: a macroblock is 16x16 pixels. (The corresponding code was not located.)
- Macroblocks per row: int mb_stride = pCodecCtx->width/16+1
- Total number of macroblocks: int mb_sum = ((pCodecCtx->height+15)>>4)*(pCodecCtx->width/16+1)
/**
 * Picture.
 */
typedef struct Picture {
    struct AVFrame *f;
    ThreadFrame tf;
    AVBufferRef *qscale_table_buf;
    int8_t *qscale_table;
    AVBufferRef *motion_val_buf[2];
    int16_t (*motion_val[2])[2];
    AVBufferRef *mb_type_buf;
    uint32_t *mb_type;             ///< types and macros are defined in mpegutils.h
    AVBufferRef *mbskip_table_buf;
    uint8_t *mbskip_table;
    AVBufferRef *ref_index_buf[2];
    int8_t *ref_index[2];
    AVBufferRef *mb_var_buf;
    uint16_t *mb_var;              ///< Table for MB variances
    AVBufferRef *mc_mb_var_buf;
    uint16_t *mc_mb_var;           ///< Table for motion compensated MB variances
    int alloc_mb_width;            ///< mb_width used to allocate tables
    int alloc_mb_height;           ///< mb_height used to allocate tables
    int alloc_mb_stride;           ///< mb_stride used to allocate tables
    AVBufferRef *mb_mean_buf;
    uint8_t *mb_mean;              ///< Table for MB luminance
    AVBufferRef *hwaccel_priv_buf;
    void *hwaccel_picture_private; ///< Hardware accelerator private data
    int field_picture;             ///< whether or not the picture was encoded in separate fields
    int64_t mb_var_sum;            ///< sum of MB variance for current frame
    int64_t mc_mb_var_sum;         ///< motion compensated MB variance for current frame
    int b_frame_score;
    int needs_realloc;             ///< Picture needs to be reallocated (eg due to a frame size change)
    int reference;
    int shared;
    uint64_t encoding_error[MPEGVIDEO_MAX_PLANES];
} Picture;
motion_val
- The motion vector table stores all of the motion vectors of one frame.
- The value is stored in a rather unusual form: int16_t (*motion_val[2])[2]
typedef struct ERPicture {
    AVFrame *f;
    ThreadFrame *tf;

    // it is the caller's responsibility to allocate these buffers
    int16_t (*motion_val[2])[2];
    int8_t *ref_index[2];

    uint32_t *mb_type;
    int field_picture;
} ERPicture;
int mv_sample_log2= 4 - motion_subsample_log2;
int mb_width= (width+15)>>4;
int mv_stride= (mb_width << mv_sample_log2) + 1;
motion_val[direction][x + y*mv_stride][0->mv_x, 1->mv_y];
- From this, the rough structure of the data is:
- 1. It is first split into two lists, L0 and L1.
- 2. Each list (L0 or L1) stores a series of MVs (each MV covers one picture area whose size is determined by motion_subsample_log2).
- 3. Each MV has a horizontal and a vertical component (x, y).
- Note that in FFmpeg, MVs and macroblocks have no direct link in storage: the first MV belongs to the top-left area of the picture (area size determined by motion_subsample_log2), the second MV to the area in row 1, column 2, and so on. The motion vectors of one 16x16 macroblock may therefore be laid out as shown below (line is the number of motion vectors per row):
// Example: relationship between motion vectors and a macroblock
// with 8x8 partitions:
// -------------------------
// |          |            |
// |mv[x]     |mv[x+1]     |
// -------------------------
// |          |            |
// |mv[x+line]|mv[x+line+1]|
// -------------------------
mb_type
- The macroblock type table stores the type of every macroblock in a frame. It is stored much like the QP table, except that the entries are uint32 rather than uint8. Each macroblock has one type value.
- The macroblock types are defined as follows:
/* MB types */
#define MB_TYPE_INTRA4x4 (1 << 0)
#define MB_TYPE_INTRA16x16 (1 << 1) // FIXME H.264-specific
#define MB_TYPE_INTRA_PCM (1 << 2) // FIXME H.264-specific
#define MB_TYPE_16x16 (1 << 3)
#define MB_TYPE_16x8 (1 << 4)
#define MB_TYPE_8x16 (1 << 5)
#define MB_TYPE_8x8 (1 << 6)
#define MB_TYPE_INTERLACED (1 << 7)
#define MB_TYPE_DIRECT2 (1 << 8) // FIXME
#define MB_TYPE_ACPRED (1 << 9)
#define MB_TYPE_GMC (1 << 10)
#define MB_TYPE_SKIP (1 << 11)
#define MB_TYPE_P0L0 (1 << 12)
#define MB_TYPE_P1L0 (1 << 13)
#define MB_TYPE_P0L1 (1 << 14)
#define MB_TYPE_P1L1 (1 << 15)
#define MB_TYPE_L0 (MB_TYPE_P0L0 | MB_TYPE_P1L0)
#define MB_TYPE_L1 (MB_TYPE_P0L1 | MB_TYPE_P1L1)
#define MB_TYPE_L0L1 (MB_TYPE_L0 | MB_TYPE_L1)
#define MB_TYPE_QUANT (1 << 16)
#define MB_TYPE_CBP (1 << 17)
#define MB_TYPE_INTRA MB_TYPE_INTRA4x4 // default mb_type if there is just one type
- If a macroblock has one or more of the types defined above, the corresponding bits of its type value are set to 1.
- Note: a macroblock can have several types at once, but some types are mutually exclusive; for example, a macroblock cannot be both 16x16 and 8x8.
ref_index
- The motion-estimation reference frame list stores the reference frame indices of every macroblock in a frame. This list was of little use in earlier compression standards; only standards such as H.264 have the concept of multiple reference frames.
- Each macroblock has four such values, each giving a reference frame index.
Code
typedef struct AVFrame {
#define AV_NUM_DATA_POINTERS 8
    /**
     * pointer to the picture/channel planes.
     * This might be different from the first allocated byte. For video,
     * it could even point to the end of the image data.
     *
     * All pointers in data and extended_data must point into one of the
     * AVBufferRef in buf or extended_buf.
     *
     * Some decoders access areas outside 0,0 - width,height, please
     * see avcodec_align_dimensions2(). Some filters and swscale can read
     * up to 16 bytes beyond the planes, if these filters are to be used,
     * then 16 extra bytes must be allocated.
     *
     * NOTE: Pointers not needed by the format MUST be set to NULL.
     *
     * @attention In case of video, the data[] pointers can point to the
     * end of image data in order to reverse line order, when used in
     * combination with negative values in the linesize[] array.
     */
    uint8_t *data[AV_NUM_DATA_POINTERS];

    /**
     * For video, a positive or negative value, which is typically indicating
     * the size in bytes of each picture line, but it can also be:
     * - the negative byte size of lines for vertical flipping
     *   (with data[n] pointing to the end of the data)
     * - a positive or negative multiple of the byte size as for accessing
     *   even and odd fields of a frame (possibly flipped)
     *
     * For audio, only linesize[0] may be set. For planar audio, each channel
     * plane must be the same size.
     *
     * For video the linesizes should be multiples of the CPUs alignment
     * preference, this is 16 or 32 for modern desktop CPUs.
     * Some code requires such alignment other code can be slower without
     * correct alignment, for yet other it makes no difference.
     *
     * @note The linesize may be larger than the size of usable data -- there
     * may be extra padding present for performance reasons.
     *
     * @attention In case of video, line size values can be negative to achieve
     * a vertically inverted iteration over image lines.
     */
    int linesize[AV_NUM_DATA_POINTERS];

    /**
     * pointers to the data planes/channels.
     *
     * For video, this should simply point to data[].
     *
     * For planar audio, each channel has a separate data pointer, and
     * linesize[0] contains the size of each channel buffer.
     * For packed audio, there is just one data pointer, and linesize[0]
     * contains the total size of the buffer for all channels.
     *
     * Note: Both data and extended_data should always be set in a valid frame,
     * but for planar audio with more channels that can fit in data,
     * extended_data must be used in order to access all channels.
     */
    uint8_t **extended_data;

    /**
     * @name Video dimensions
     * Video frames only. The coded dimensions (in pixels) of the video frame,
     * i.e. the size of the rectangle that contains some well-defined values.
     *
     * @note The part of the frame intended for display/presentation is further
     * restricted by the @ref cropping "Cropping rectangle".
     * @{
     */
    int width, height;
    /**
     * @}
     */

    /**
     * number of audio samples (per channel) described by this frame
     */
    int nb_samples;

    /**
     * format of the frame, -1 if unknown or unset
     * Values correspond to enum AVPixelFormat for video frames,
     * enum AVSampleFormat for audio)
     */
    int format;

    /**
     * 1 -> keyframe, 0-> not
     */
    int key_frame;

    /**
     * Picture type of the frame.
     */
    enum AVPictureType pict_type;

    /**
     * Sample aspect ratio for the video frame, 0/1 if unknown/unspecified.
     */
    AVRational sample_aspect_ratio;

    /**
     * Presentation timestamp in time_base units (time when frame should be shown to user).
     */
    int64_t pts;

    /**
     * DTS copied from the AVPacket that triggered returning this frame. (if frame threading isn't used)
     * This is also the Presentation time of this AVFrame calculated from
     * only AVPacket.dts values without pts values.
     */
    int64_t pkt_dts;

    /**
     * Time base for the timestamps in this frame.
     * In the future, this field may be set on frames output by decoders or
     * filters, but its value will be by default ignored on input to encoders
     * or filters.
     */
    AVRational time_base;

    /**
     * picture number in bitstream order
     */
    int coded_picture_number;
    /**
     * picture number in display order
     */
    int display_picture_number;

    /**
     * quality (between 1 (good) and FF_LAMBDA_MAX (bad))
     */
    int quality;

    /**
     * for some private data of the user
     */
    void *opaque;

    /**
     * When decoding, this signals how much the picture must be delayed.
     * extra_delay = repeat_pict / (2*fps)
     */
    int repeat_pict;

    /**
     * The content of the picture is interlaced.
     */
    int interlaced_frame;

    /**
     * If the content is interlaced, is top field displayed first.
     */
    int top_field_first;

    /**
     * Tell user application that palette has changed from previous frame.
     */
    int palette_has_changed;

    /**
     * reordered opaque 64 bits (generally an integer or a double precision float
     * PTS but can be anything).
     * The user sets AVCodecContext.reordered_opaque to represent the input at
     * that time,
     * the decoder reorders values as needed and sets AVFrame.reordered_opaque
     * to exactly one of the values provided by the user through AVCodecContext.reordered_opaque
     */
    int64_t reordered_opaque;

    /**
     * Sample rate of the audio data.
     */
    int sample_rate;

#if FF_API_OLD_CHANNEL_LAYOUT
    /**
     * Channel layout of the audio data.
     * @deprecated use ch_layout instead
     */
    attribute_deprecated
    uint64_t channel_layout;
#endif

    /**
     * AVBuffer references backing the data for this frame. All the pointers in
     * data and extended_data must point inside one of the buffers in buf or
     * extended_buf. This array must be filled contiguously -- if buf[i] is
     * non-NULL then buf[j] must also be non-NULL for all j < i.
     *
     * There may be at most one AVBuffer per data plane, so for video this array
     * always contains all the references. For planar audio with more than
     * AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in
     * this array. Then the extra AVBufferRef pointers are stored in the
     * extended_buf array.
     */
    AVBufferRef *buf[AV_NUM_DATA_POINTERS];

    /**
     * For planar audio which requires more than AV_NUM_DATA_POINTERS
     * AVBufferRef pointers, this array will hold all the references which
     * cannot fit into AVFrame.buf.
     *
     * Note that this is different from AVFrame.extended_data, which always
     * contains all the pointers. This array only contains the extra pointers,
     * which cannot fit into AVFrame.buf.
     *
     * This array is always allocated using av_malloc() by whoever constructs
     * the frame. It is freed in av_frame_unref().
     */
    AVBufferRef **extended_buf;
    /**
     * Number of elements in extended_buf.
     */
    int nb_extended_buf;

    AVFrameSideData **side_data;
    int nb_side_data;

/**
 * @defgroup lavu_frame_flags AV_FRAME_FLAGS
 * @ingroup lavu_frame
 * Flags describing additional frame properties.
 *
 * @{
 */

/**
 * The frame data may be corrupted, e.g. due to decoding errors.
 */
#define AV_FRAME_FLAG_CORRUPT (1 << 0)
/**
 * A flag to mark the frames which need to be decoded, but shouldn't be output.
 */
#define AV_FRAME_FLAG_DISCARD (1 << 2)
/**
 * @}
 */

    /**
     * Frame flags, a combination of @ref lavu_frame_flags
     */
    int flags;

    /**
     * MPEG vs JPEG YUV range.
     * - encoding: Set by user
     * - decoding: Set by libavcodec
     */
    enum AVColorRange color_range;

    enum AVColorPrimaries color_primaries;

    enum AVColorTransferCharacteristic color_trc;

    /**
     * YUV colorspace type.
     * - encoding: Set by user
     * - decoding: Set by libavcodec
     */
    enum AVColorSpace colorspace;

    enum AVChromaLocation chroma_location;

    /**
     * frame timestamp estimated using various heuristics, in stream time base
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    int64_t best_effort_timestamp;

    /**
     * reordered pos from the last AVPacket that has been input into the decoder
     * - encoding: unused
     * - decoding: Read by user.
     */
    int64_t pkt_pos;

    /**
     * duration of the corresponding packet, expressed in
     * AVStream->time_base units, 0 if unknown.
     * - encoding: unused
     * - decoding: Read by user.
     */
    int64_t pkt_duration;

    /**
     * metadata.
     * - encoding: Set by user.
     * - decoding: Set by libavcodec.
     */
    AVDictionary *metadata;

    /**
     * decode error flags of the frame, set to a combination of
     * FF_DECODE_ERROR_xxx flags if the decoder produced a frame, but there
     * were errors during the decoding.
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    int decode_error_flags;
#define FF_DECODE_ERROR_INVALID_BITSTREAM   1
#define FF_DECODE_ERROR_MISSING_REFERENCE   2
#define FF_DECODE_ERROR_CONCEALMENT_ACTIVE  4
#define FF_DECODE_ERROR_DECODE_SLICES       8

#if FF_API_OLD_CHANNEL_LAYOUT
    /**
     * number of audio channels, only used for audio.
     * - encoding: unused
     * - decoding: Read by user.
     * @deprecated use ch_layout instead
     */
    attribute_deprecated
    int channels;
#endif

    /**
     * size of the corresponding packet containing the compressed
     * frame.
     * It is set to a negative value if unknown.
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    int pkt_size;

    /**
     * For hwaccel-format frames, this should be a reference to the
     * AVHWFramesContext describing the frame.
     */
    AVBufferRef *hw_frames_ctx;

    /**
     * AVBufferRef for free use by the API user. FFmpeg will never check the
     * contents of the buffer ref. FFmpeg calls av_buffer_unref() on it when
     * the frame is unreferenced. av_frame_copy_props() calls create a new
     * reference with av_buffer_ref() for the target frame's opaque_ref field.
     *
     * This is unrelated to the opaque field, although it serves a similar
     * purpose.
     */
    AVBufferRef *opaque_ref;

    /**
     * @anchor cropping
     * @name Cropping
     * Video frames only. The number of pixels to discard from the the
     * top/bottom/left/right border of the frame to obtain the sub-rectangle of
     * the frame intended for presentation.
     * @{
     */
    size_t crop_top;
    size_t crop_bottom;
    size_t crop_left;
    size_t crop_right;
    /**
     * @}
     */

    /**
     * AVBufferRef for internal use by a single libav* library.
     * Must not be used to transfer data between libraries.
     * Has to be NULL when ownership of the frame leaves the respective library.
     *
     * Code outside the FFmpeg libs should never check or change the contents of the buffer ref.
     *
     * FFmpeg calls av_buffer_unref() on it when the frame is unreferenced.
     * av_frame_copy_props() calls create a new reference with av_buffer_ref()
     * for the target frame's private_ref field.
     */
    AVBufferRef *private_ref;

    /**
     * Channel layout of the audio data.
     */
    AVChannelLayout ch_layout;
} AVFrame;