Introduction
VideoCore is an open-source video processing library that supports capture, compositing, encoding, and RTMP streaming.
Repository: github.com/jgh-/VideoC…
Code Structure
VideoCore's processing pipeline looks like this:
Source (Camera) -> Transform (Composite) -> Transform (H.264 Encode) -> Transform (RTMP Packetize) -> Output (RTMP)
VideoCore's code layout:
videocore/
  sources/
    videocore::ISource
    videocore::IAudioSource : videocore::ISource
    videocore::IVideoSource : videocore::ISource
    videocore::Watermark : videocore::IVideoSource
    iOS/
      videocore::iOS::CameraSource : videocore::IVideoSource
    Apple/
      videocore::Apple::MicSource : videocore::IAudioSource
    OSX/
      videocore::OSX::DisplaySource : videocore::IVideoSource
      videocore::OSX::SystemAudioSource : videocore::IAudioSource
  outputs/
    videocore::IOutput
    videocore::ITransform : videocore::IOutput
    iOS/
      videocore::iOS::H264Transform : videocore::ITransform
      videocore::iOS::AACTransform : videocore::ITransform
    OSX/
      videocore::OSX::H264Transform : videocore::ITransform
      videocore::OSX::AACTransform : videocore::ITransform
    RTMP/
      videocore::rtmp::H264Packetizer : videocore::ITransform
      videocore::rtmp::AACPacketizer : videocore::ITransform
  mixers/
    videocore::IMixer
    videocore::IAudioMixer : videocore::IMixer
    videocore::IVideoMixer : videocore::IMixer
    videocore::AudioMixer : videocore::IAudioMixer
    iOS/
      videocore::iOS::GLESVideoMixer : videocore::IVideoMixer
    OSX/
      videocore::OSX::GLVideoMixer : videocore::IVideoMixer
  rtmp/
    videocore::RTMPSession : videocore::IOutput
  stream/
    videocore::IStreamSession
    Apple/
      videocore::Apple::StreamSession : videocore::IStreamSession
The sample project includes a VCSimpleSession class that shows how to use videocore. After initialization it calls setupGraph, whose logic is roughly as follows.
Video pipeline setup:
CameraSource -> AspectTransform -> PositionTransform -> GLESVideoMixer -> Split -> PixelBufferOutput
Audio pipeline setup:
MicSource -> AudioMixer
When VideoCore is used to capture audio and video, the processing flow is roughly as shown above: once the camera and microphone produce data, pushBuffer passes it to the next node for further processing. When streaming starts, the addEncodersAndPacketizers method is called to append the rest of the processing chain.
Before RTMP streaming, the audio and video data must be encoded and then packetized, and the packetized data is placed in the streaming queue:
GLESVideoMixer -> H264EncodeApple -> Split -> H264Packetizer -> RTMPSession
AudioMixer -> AACEncode -> Split -> AACPacketizer -> RTMPSession
The Split class simply forwards data to the next node(s); it does no other processing.
The IOutput base class
IOutput is the base class for all outputs. It declares two virtual functions; pushBuffer is the entry point through which data is passed to the next node.
class IOutput
{
public:
    virtual void setEpoch(const std::chrono::steady_clock::time_point epoch) {};
    virtual void pushBuffer(const uint8_t* const data, size_t size, IMetadata& metadata) = 0;
    virtual ~IOutput() {};
};
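The Split node mentioned earlier is essentially a pass-through implementation of this interface. As a rough sketch (assuming ITransform adds a setOutput(std::shared_ptr<IOutput>) method for chaining nodes, as the VideoCore headers suggest; the class below is hypothetical, not part of the library):

class PassthroughTransform : public videocore::ITransform
{
public:
    // Remember the downstream node; the graph is built by chaining setOutput calls.
    void setOutput(std::shared_ptr<videocore::IOutput> output) {
        m_output = output;
    }
    // Forward the buffer unchanged to the next node, which is all Split does.
    void pushBuffer(const uint8_t* const data, size_t size, videocore::IMetadata& metadata) {
        auto output = m_output.lock();
        if(output) {
            output->pushBuffer(data, size, metadata);
        }
    }
private:
    std::weak_ptr<videocore::IOutput> m_output;
};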
Video capture
Video capture is handled by the CameraSource class, which uses the iOS AVFoundation framework; the core code is in the setupCamera method.
In the AVCaptureVideoDataOutput delegate callback, the captured sample buffer is handed to the source:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    auto source = m_source.lock();
    if(source) {
        source->bufferCaptured(CMSampleBufferGetImageBuffer(sampleBuffer));
    }
}
Here, source is the CameraSource instance:
void CameraSource::bufferCaptured(CVPixelBufferRef pixelBufferRef)
{
    auto output = m_output.lock();
    if(output) {
        VideoBufferMetadata md(1.f / float(m_fps));
        md.setData(1, m_matrix, false, shared_from_this());
        auto pixelBuffer = std::make_shared<Apple::PixelBuffer>(pixelBufferRef, true);
        pixelBuffer->setState(kVCPixelBufferStateEnqueued);
        output->pushBuffer((uint8_t*)&pixelBuffer, sizeof(pixelBuffer), md);
    }
}
In bufferCaptured, the CVPixelBufferRef is wrapped in a PixelBuffer, its state is set to kVCPixelBufferStateEnqueued, and the PixelBuffer is passed to the next node (as in CameraSource -> AspectTransform -> PositionTransform -> GLESVideoMixer).
AspectTransform and PositionTransform mainly apply adjustments to the video, such as translation and scaling.
GLESVideoMixer renders the video into a texture and passes the rendered frame to the next node (H264EncodeApple).
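Putting these pieces together, setupGraph in VCSimpleSession builds the graph by chaining setOutput calls. A simplified sketch of the video branch (constructor arguments are omitted because they depend on resolution, aspect mode and viewport; this is not the verbatim VCSimpleSession code):

// Simplified wiring of the video capture branch.
auto cameraSource      = std::make_shared<videocore::iOS::CameraSource>();
auto aspectTransform   = std::make_shared<videocore::AspectTransform>(/* width, height, aspect mode */);
auto positionTransform = std::make_shared<videocore::PositionTransform>(/* position and size in the mixer */);
auto videoMixer        = std::make_shared<videocore::iOS::GLESVideoMixer>(/* output size, frame duration */);

cameraSource->setOutput(aspectTransform);       // CameraSource -> AspectTransform
aspectTransform->setOutput(positionTransform);  // AspectTransform -> PositionTransform
positionTransform->setOutput(videoMixer);       // PositionTransform -> GLESVideoMixer
// When streaming starts, addEncodersAndPacketizers routes the mixer's output on to the H.264 encoder.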
Audio capture
Audio capture lives in the MicSource class, which uses the AudioToolbox framework; the core code is in the MicSource() constructor. In the capture callback handleInputBuffer, the captured data is retrieved, wrapped in an AudioBuffer along with the relevant parameters, and passed to the next node (AudioMixer).
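The MicSource() constructor itself is not shown in this article; in essence it creates a RemoteIO Audio Unit, enables capture on its input bus, and registers handleInputBuffer as the input callback. A rough sketch of such a setup (error handling omitted; this is not the verbatim MicSource code):

// Sketch of a RemoteIO audio unit configured for microphone capture.
AudioComponentDescription desc = {0};
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

AudioComponent component = AudioComponentFindNext(NULL, &desc);
AudioUnit audioUnit;
AudioComponentInstanceNew(component, &audioUnit);

UInt32 one = 1;   // enable capture on the input bus (bus 1)
AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Input, 1, &one, sizeof(one));

AURenderCallbackStruct cb;
cb.inputProc = handleInputBuffer;   // the callback shown below
cb.inputProcRefCon = this;          // handed back to the callback as inRefCon
AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_SetInputCallback,
                     kAudioUnitScope_Global, 1, &cb, sizeof(cb));

AudioUnitInitialize(audioUnit);
AudioOutputUnitStart(audioUnit);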
static OSStatus handleInputBuffer(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData)
{
    videocore::iOS::MicSource* mc = static_cast<videocore::iOS::MicSource*>(inRefCon);
    AudioBuffer buffer;
    buffer.mData = NULL;
    buffer.mDataByteSize = 0;
    buffer.mNumberChannels = 2;
    AudioBufferList buffers;
    buffers.mNumberBuffers = 1;
    buffers.mBuffers[0] = buffer;
    OSStatus status = AudioUnitRender(mc->audioUnit(),
                                      ioActionFlags,
                                      inTimeStamp,
                                      inBusNumber,
                                      inNumberFrames,
                                      &buffers);
    if(!status) {
        mc->inputCallback((uint8_t*)buffers.mBuffers[0].mData, buffers.mBuffers[0].mDataByteSize, inNumberFrames);
    }
    return status;
}

void MicSource::inputCallback(uint8_t *data, size_t data_size, int inNumberFrames)
{
    auto output = m_output.lock();
    if(output) {
        videocore::AudioBufferMetadata md (0.);
        md.setData(m_sampleRate,
                   16,
                   m_channelCount,
                   kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
                   m_channelCount * 2,
                   inNumberFrames,
                   false,
                   false,
                   shared_from_this());
        output->pushBuffer(data, data_size, md);
    }
}
After AudioMixer receives the data, it resamples the audio and places it in a queue of data awaiting encoding; the queue is a linked-list structure. Once AudioMixer's start() has been called, audio data is taken from the queue and passed to the next node (AACEncode).
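The resampling and mixing logic in AudioMixer is fairly involved. As an illustration of the resampling step only, a naive linear-interpolation resampler for 16-bit mono PCM could look like this (a sketch of the idea, not the AudioMixer implementation):

#include <cstdint>
#include <vector>

// Naive linear-interpolation resampler for 16-bit mono PCM, for illustration only.
std::vector<int16_t> resample(const std::vector<int16_t>& in, double inRate, double outRate)
{
    const double ratio = inRate / outRate;
    const size_t outCount = static_cast<size_t>(in.size() / ratio);
    std::vector<int16_t> out(outCount);
    for(size_t i = 0; i < outCount; ++i) {
        const double pos  = i * ratio;                 // position in the input signal
        const size_t idx  = static_cast<size_t>(pos);
        const double frac = pos - idx;
        const int16_t a = in[idx];
        const int16_t b = (idx + 1 < in.size()) ? in[idx + 1] : a;
        out[i] = static_cast<int16_t>(a + (b - a) * frac);  // interpolate between neighbours
    }
    return out;
}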
Video encoding
After receiving data from the previous node, H264EncodeApple compresses it with H.264:
void H264EncodeApple::pushBuffer(const uint8_t *const data, size_t size, videocore::IMetadata &metadata)
{
#if VERSION_OK
    if(m_compressionSession) {
        m_encodeMutex.lock();
        VTCompressionSessionRef session = (VTCompressionSessionRef)m_compressionSession;
        CMTime pts = CMTimeMake(metadata.timestampDelta + m_ctsOffset, 1000.); // timestamp is in ms.
        CMTime dur = CMTimeMake(1, m_fps);
        VTEncodeInfoFlags flags;
        CFMutableDictionaryRef frameProps = NULL;
        if(m_forceKeyframe) {
            s_forcedKeyframePTS = pts.value;
            frameProps = CFDictionaryCreateMutable(kCFAllocatorDefault, 1, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
            CFDictionaryAddValue(frameProps, kVTEncodeFrameOptionKey_ForceKeyFrame, kCFBooleanTrue);
        }
        VTCompressionSessionEncodeFrame(session, (CVPixelBufferRef)data, pts, dur, frameProps, NULL, &flags);
        if(m_forceKeyframe) {
            CFRelease(frameProps);
            m_forceKeyframe = false;
        }
        m_encodeMutex.unlock();
    }
#endif
}
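The frames above are handed to a VTCompressionSession that has to be created and configured before encoding starts (VideoCore does this when the encoder is set up). A minimal sketch of such a setup, with illustrative resolution, bitrate and keyframe-interval values rather than VideoCore's defaults:

// Minimal VideoToolbox compression session setup; values are illustrative.
VTCompressionSessionRef session = NULL;
OSStatus err = VTCompressionSessionCreate(kCFAllocatorDefault,
                                          1280, 720,                 // output dimensions
                                          kCMVideoCodecType_H264,
                                          NULL, NULL, NULL,
                                          vtCallback,                // the callback shown below
                                          this,                      // outputCallbackRefCon
                                          &session);
if(err == noErr) {
    VTSessionSetProperty(session, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);

    int32_t bitrate = 1000000;  // 1 Mbit/s
    CFNumberRef br = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitrate);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_AverageBitRate, br);
    CFRelease(br);

    int32_t keyframeInterval = 60;  // one keyframe every 60 frames
    CFNumberRef kf = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &keyframeInterval);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_MaxKeyFrameInterval, kf);
    CFRelease(kf);

    VTCompressionSessionPrepareToEncodeFrames(session);
}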
Hardware encoding provided by iOS (VideoToolbox) is used here. In the encoding callback, the encoded data is passed to the next node:
void vtCallback(void *outputCallbackRefCon,
                void *sourceFrameRefCon,
                OSStatus status,
                VTEncodeInfoFlags infoFlags,
                CMSampleBufferRef sampleBuffer )
{
    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    CMTime dts = CMSampleBufferGetDecodeTimeStamp(sampleBuffer);
    //printf("status: %d\n", (int) status);
    bool isKeyframe = false;
    if(attachments != NULL) {
        CFDictionaryRef attachment;
        CFBooleanRef dependsOnOthers;
        attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyframe = (dependsOnOthers == kCFBooleanFalse);
    }
    if(isKeyframe) {
        // Send the SPS and PPS.
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t spsSize, ppsSize;
        size_t parmCount;
        const uint8_t* sps, *pps;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, nullptr);
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, nullptr);
        std::unique_ptr<uint8_t[]> sps_buf (new uint8_t[spsSize + 4]);
        std::unique_ptr<uint8_t[]> pps_buf (new uint8_t[ppsSize + 4]);
        memcpy(&sps_buf[4], sps, spsSize);
        spsSize += 4;
        memcpy(&sps_buf[0], &spsSize, 4);
        memcpy(&pps_buf[4], pps, ppsSize);
        ppsSize += 4;
        memcpy(&pps_buf[0], &ppsSize, 4);
        ((H264EncodeApple*)outputCallbackRefCon)->compressionSessionOutput((uint8_t*)sps_buf.get(), spsSize, pts.value, dts.value);
        ((H264EncodeApple*)outputCallbackRefCon)->compressionSessionOutput((uint8_t*)pps_buf.get(), ppsSize, pts.value, dts.value);
    }
    char* bufferData;
    size_t size;
    CMBlockBufferGetDataPointer(block, 0, NULL, &size, &bufferData);
    ((H264EncodeApple*)outputCallbackRefCon)->compressionSessionOutput((uint8_t*)bufferData, size, pts.value, dts.value);
}

void H264EncodeApple::compressionSessionOutput(const uint8_t *data, size_t size, uint64_t pts, uint64_t dts)
{
#if VERSION_OK
    auto l = m_output.lock();
    if(l && data && size > 0) {
        videocore::VideoBufferMetadata md(pts, dts);
        l->pushBuffer(data, size, md);
    }
#endif
}
Audio encoding
After AACEncode receives the audio data it encodes it; VideoCore uses the iOS AudioToolbox framework for the AAC transcoding.
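The converter itself is created from an input PCM description and an output AAC description before any data arrives (VideoCore also queries the maximum output packet size, m_outputPacketMaxSize, from the converter). A simplified sketch of that setup, with illustrative sample rate and channel count:

// Describe interleaved 16-bit PCM in and AAC-LC out; values are illustrative.
AudioStreamBasicDescription in = {0}, out = {0};
in.mSampleRate       = 44100;
in.mFormatID         = kAudioFormatLinearPCM;
in.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
in.mChannelsPerFrame = 2;
in.mBitsPerChannel   = 16;
in.mBytesPerFrame    = in.mChannelsPerFrame * (in.mBitsPerChannel / 8);
in.mFramesPerPacket  = 1;
in.mBytesPerPacket   = in.mBytesPerFrame;

out.mSampleRate       = 44100;
out.mFormatID         = kAudioFormatMPEG4AAC;   // AAC-LC
out.mChannelsPerFrame = 2;
out.mFramesPerPacket  = 1024;                   // one AAC packet covers 1024 PCM samples

AudioConverterRef converter = NULL;
OSStatus err = AudioConverterNew(&in, &out, &converter);

With the converter in place, AACEncode feeds PCM into it through the ioProc callback and the pushBuffer method shown below.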
OSStatus AACEncode::ioProc(AudioConverterRef audioConverter, UInt32 *ioNumDataPackets, AudioBufferList* ioData, AudioStreamPacketDescription** ioPacketDesc, void* inUserData )
{
    UserData* ud = static_cast<UserData*>(inUserData);
    UInt32 maxPackets = ud->size / ud->packetSize;
    *ioNumDataPackets = std::min(maxPackets, *ioNumDataPackets);
    ioData->mBuffers[0].mData = ud->data;
    ioData->mBuffers[0].mDataByteSize = ud->size;
    ioData->mBuffers[0].mNumberChannels = 1;
    return noErr;
}

void AACEncode::pushBuffer(const uint8_t* const data, size_t size, IMetadata& metadata)
{
    const size_t sampleCount = size / m_bytesPerSample;
    const size_t aac_packet_count = sampleCount / kSamplesPerFrame;
    const size_t required_bytes = aac_packet_count * m_outputPacketMaxSize;
    if(m_outputBuffer.total() < (required_bytes)) {
        m_outputBuffer.resize(required_bytes);
    }
    uint8_t* p = m_outputBuffer();
    uint8_t* p_out = (uint8_t*)data;
    for ( size_t i = 0 ; i < aac_packet_count ; ++i ) {
        UInt32 num_packets = 1;
        AudioBufferList l;
        l.mNumberBuffers = 1;
        l.mBuffers[0].mDataByteSize = m_outputPacketMaxSize * num_packets;
        l.mBuffers[0].mData = p;
        std::unique_ptr<UserData> ud(new UserData());
        ud->size = static_cast<int>(kSamplesPerFrame * m_bytesPerSample);
        ud->data = const_cast<uint8_t*>(p_out);
        ud->packetSize = static_cast<int>(m_bytesPerSample);
        AudioStreamPacketDescription output_packet_desc[num_packets];
        m_converterMutex.lock();
        AudioConverterFillComplexBuffer(m_audioConverter, AACEncode::ioProc, ud.get(), &num_packets, &l, output_packet_desc);
        m_converterMutex.unlock();
        p += output_packet_desc[0].mDataByteSize;
        p_out += kSamplesPerFrame * m_bytesPerSample;
    }
    const size_t totalBytes = p - m_outputBuffer();
    auto output = m_output.lock();
    if(output && totalBytes) {
        if(!m_sentConfig) {
            output->pushBuffer((const uint8_t*)m_asc, sizeof(m_asc), metadata);
            m_sentConfig = true;
        }
        output->pushBuffer(m_outputBuffer(), totalBytes, metadata);
    }
}
Once the audio data has been encoded, it is passed on to the next nodes (Split -> AACPacketizer).
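The m_asc sent in pushBuffer above is the two-byte AudioSpecificConfig that the AAC packetizer and players need before the first AAC frame: 5 bits of audio object type, 4 bits of sampling-frequency index and 4 bits of channel configuration. A sketch of how such a config can be derived (the helper name is hypothetical):

// Build a 2-byte AudioSpecificConfig for AAC-LC; the helper name is hypothetical.
void makeAudioSpecificConfig(uint8_t asc[2], uint8_t samplingIndex, uint8_t channelCount)
{
    const uint8_t audioObjectType = 2;                            // 2 = AAC-LC
    asc[0] = (audioObjectType << 3) | (samplingIndex >> 1);       // 5 bits type + top 3 bits of index
    asc[1] = ((samplingIndex & 0x1) << 7) | (channelCount << 3);  // low bit of index + 4 bits channels
}
// Example: 44.1 kHz has sampling-frequency index 4, so stereo AAC-LC yields { 0x12, 0x10 }.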
Streaming
The streaming logic is handled in the final node (RTMPSession); the key code is as follows:
void RTMPSession::pushBuffer(const uint8_t* const data, size_t size, IMetadata& metadata)
{
    if(m_ending) {
        return;
    }
    // make the lambda capture the data
    std::shared_ptr<Buffer> buf = std::make_shared<Buffer>(size);
    buf->put(const_cast<uint8_t*>(data), size);

    const RTMPMetadata_t inMetadata = static_cast<const RTMPMetadata_t&>(metadata);

    m_jobQueue.enqueue([=]() {
        if(!this->m_ending) {
            static int c_count = 0;
            c_count++;

            auto packetTime = std::chrono::steady_clock::now();
            std::vector<uint8_t> chunk;
            chunk.reserve(size + 64);
            size_t len = buf->size();
            size_t tosend = std::min(len, m_outChunkSize);
            uint8_t* p;
            buf->read(&p, buf->size());
            uint64_t ts = inMetadata.getData<kRTMPMetadataTimestamp>();
            const int streamId = inMetadata.getData<kRTMPMetadataMsgStreamId>();

#ifndef RTMP_CHUNK_TYPE_0_ONLY
            auto it = m_previousChunkData.find(streamId);
            if(it == m_previousChunkData.end()) {
#endif
                // Type 0.
                put_byte(chunk, (streamId & 0x1F));
                put_be24(chunk, static_cast<uint32_t>(ts));
                put_be24(chunk, inMetadata.getData<kRTMPMetadataMsgLength>());
                put_byte(chunk, inMetadata.getData<kRTMPMetadataMsgTypeId>());
                put_buff(chunk, (uint8_t*)&m_streamId, sizeof(int32_t)); // msg stream id is little-endian
#ifndef RTMP_CHUNK_TYPE_0_ONLY
            } else {
                // Type 1.
                put_byte(chunk, RTMP_CHUNK_TYPE_1 | (streamId & 0x1F));
                put_be24(chunk, static_cast<uint32_t>(ts - it->second)); // timestamp delta
                put_be24(chunk, inMetadata.getData<kRTMPMetadataMsgLength>());
                put_byte(chunk, inMetadata.getData<kRTMPMetadataMsgTypeId>());
            }
#endif
            m_previousChunkData[streamId] = ts;
            put_buff(chunk, p, tosend);
            len -= tosend;
            p += tosend;

            while(len > 0) {
                tosend = std::min(len, m_outChunkSize);
                p[-1] = RTMP_CHUNK_TYPE_3 | (streamId & 0x1F);
                put_buff(chunk, p - 1, tosend + 1);
                p += tosend;
                len -= tosend;
            }
            this->write(&chunk[0], chunk.size(), packetTime, inMetadata.getData<kRTMPMetadataIsKeyframe>());
        }
    });
}
The data is packed into RTMP chunks and then pushed to the server.
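The put_byte, put_be24 and put_buff calls above are small serialization helpers defined in VideoCore's RTMP code. Judging from how they are used, their behavior is roughly the following (a sketch, not the exact VideoCore implementation):

// Append one byte to the chunk buffer.
static void put_byte(std::vector<uint8_t>& buf, uint8_t val)
{
    buf.push_back(val);
}

// Append a 24-bit integer in big-endian byte order (used for timestamps and message lengths).
static void put_be24(std::vector<uint8_t>& buf, uint32_t val)
{
    buf.push_back((val >> 16) & 0xFF);
    buf.push_back((val >> 8) & 0xFF);
    buf.push_back(val & 0xFF);
}

// Append a raw byte range.
static void put_buff(std::vector<uint8_t>& buf, const uint8_t* src, size_t srcSize)
{
    buf.insert(buf.end(), src, src + srcSize);
}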