Introduction
VideoCore is an open-source video processing library that supports capture, compositing, encoding, and RTMP streaming.
Repository: github.com/jgh-/VideoC…
Code Structure
VideoCore's processing pipeline looks like this:
Source (Camera) -> Transform (Composite) -> Transform (H.264 Encode) -> Transform (RTMP Packetize) -> Output (RTMP)
VideoCore's code layout:
videocore/
  sources/
    videocore::ISource
    videocore::IAudioSource : videocore::ISource
    videocore::IVideoSource : videocore::ISource
    videocore::Watermark : videocore::IVideoSource
    iOS/
      videocore::iOS::CameraSource : videocore::IVideoSource
    Apple/
      videocore::Apple::MicSource : videocore::IAudioSource
    OSX/
      videocore::OSX::DisplaySource : videocore::IVideoSource
      videocore::OSX::SystemAudioSource : videocore::IAudioSource
  outputs/
    videocore::IOutput
    videocore::ITransform : videocore::IOutput
    iOS/
      videocore::iOS::H264Transform : videocore::ITransform
      videocore::iOS::AACTransform : videocore::ITransform
    OSX/
      videocore::OSX::H264Transform : videocore::ITransform
      videocore::OSX::AACTransform : videocore::ITransform
    RTMP/
      videocore::rtmp::H264Packetizer : videocore::ITransform
      videocore::rtmp::AACPacketizer : videocore::ITransform
  mixers/
    videocore::IMixer
    videocore::IAudioMixer : videocore::IMixer
    videocore::IVideoMixer : videocore::IMixer
    videocore::AudioMixer : videocore::IAudioMixer
    iOS/
      videocore::iOS::GLESVideoMixer : videocore::IVideoMixer
    OSX/
      videocore::OSX::GLVideoMixer : videocore::IVideoMixer
  rtmp/
    videocore::RTMPSession : videocore::IOutput
  stream/
    videocore::IStreamSession
    Apple/
      videocore::Apple::StreamSession : videocore::IStreamSession
The sample project includes a VCSimpleSession class that shows how to use videocore. After initialization it calls setupGraph, whose logic is roughly as follows.
Video pipeline setup:
CameraSource -> AspectTransform -> PositionTransform -> GLESVideoMixer -> Split -> PixelBufferOutput
Audio pipeline setup:
MicSource -> AudioMixer
When VideoCore is used to capture audio and video, the processing flow is roughly as shown above: once the camera and microphone produce data, pushBuffer passes it to the next node for further processing. When streaming starts, the addEncodersAndPacketizers method is called to append the rest of the processing chain.
Before RTMP streaming, the audio and video data must be encoded and then packetized, and the packetized data is placed in the streaming queue:
GLESVideoMixer -> H264EncodeApple -> Split -> H264Packetizer -> RTMPSession
AudioMixer -> AACEncode -> Split -> AACPacketizer -> RTMPSession
The Split class simply forwards data to the next node(s); it does no other processing.
The IOutput base class
IOutput is the base class for all outputs. It declares two virtual functions; pushBuffer is the entry point through which data is passed to the next node.
class IOutput
{
public:
    virtual void setEpoch(const std::chrono::steady_clock::time_point epoch) {};
    virtual void pushBuffer(const uint8_t* const data, size_t size, IMetadata& metadata) = 0;
    virtual ~IOutput() {};
};
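The Split node mentioned earlier is essentially a pass-through implementation of this interface. As a rough sketch (assuming ITransform adds a setOutput(std::shared_ptr<IOutput>) method for chaining nodes, as the VideoCore headers suggest; the class below is hypothetical, not part of the library):

class PassthroughTransform : public videocore::ITransform
{
public:
    // Remember the downstream node; the graph is built by chaining setOutput calls.
    void setOutput(std::shared_ptr<videocore::IOutput> output) {
        m_output = output;
    }
    // Forward the buffer unchanged to the next node, which is all Split does.
    void pushBuffer(const uint8_t* const data, size_t size, videocore::IMetadata& metadata) {
        auto output = m_output.lock();
        if(output) {
            output->pushBuffer(data, size, metadata);
        }
    }
private:
    std::weak_ptr<videocore::IOutput> m_output;
};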
Video capture
Video capture is handled by the CameraSource class, which uses the iOS AVFoundation framework; the core code is in the setupCamera method.
In the AVCaptureVideoDataOutput delegate callback, the captured sample buffer is handed to the source:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    auto source = m_source.lock();
    if(source) {
        source->bufferCaptured(CMSampleBufferGetImageBuffer(sampleBuffer));
    }
}
Here, source is the CameraSource instance:
void CameraSource::bufferCaptured(CVPixelBufferRef pixelBufferRef)
{
    auto output = m_output.lock();
    if(output) {
        VideoBufferMetadata md(1.f / float(m_fps));
        md.setData(1, m_matrix, false, shared_from_this());
        auto pixelBuffer = std::make_shared<Apple::PixelBuffer>(pixelBufferRef, true);
        pixelBuffer->setState(kVCPixelBufferStateEnqueued);
        output->pushBuffer((uint8_t*)&pixelBuffer, sizeof(pixelBuffer), md);
    }
}
In bufferCaptured, the CVPixelBufferRef is wrapped in a PixelBuffer, its state is set to kVCPixelBufferStateEnqueued, and the PixelBuffer is passed to the next node (as in CameraSource -> AspectTransform -> PositionTransform -> GLESVideoMixer).
AspectTransform and PositionTransform mainly apply adjustments to the video, such as translation and scaling.
GLESVideoMixer renders the video into a texture and passes the rendered frame to the next node (H264EncodeApple).
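Putting these pieces together, setupGraph in VCSimpleSession builds the graph by chaining setOutput calls. A simplified sketch of the video branch (constructor arguments are omitted because they depend on resolution, aspect mode and viewport; this is not the verbatim VCSimpleSession code):

// Simplified wiring of the video capture branch.
auto cameraSource      = std::make_shared<videocore::iOS::CameraSource>();
auto aspectTransform   = std::make_shared<videocore::AspectTransform>(/* width, height, aspect mode */);
auto positionTransform = std::make_shared<videocore::PositionTransform>(/* position and size in the mixer */);
auto videoMixer        = std::make_shared<videocore::iOS::GLESVideoMixer>(/* output size, frame duration */);

cameraSource->setOutput(aspectTransform);       // CameraSource -> AspectTransform
aspectTransform->setOutput(positionTransform);  // AspectTransform -> PositionTransform
positionTransform->setOutput(videoMixer);       // PositionTransform -> GLESVideoMixer
// When streaming starts, addEncodersAndPacketizers routes the mixer's output on to the H.264 encoder.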
Audio capture
Audio capture lives in the MicSource class, which uses the AudioToolbox framework; the core code is in the MicSource() constructor. In the capture callback handleInputBuffer, the captured data is retrieved, wrapped in an AudioBuffer along with the relevant parameters, and passed to the next node (AudioMixer).
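The MicSource() constructor itself is not shown in this article; in essence it creates a RemoteIO Audio Unit, enables capture on its input bus, and registers handleInputBuffer as the input callback. A rough sketch of such a setup (error handling omitted; this is not the verbatim MicSource code):

// Sketch of a RemoteIO audio unit configured for microphone capture.
AudioComponentDescription desc = {0};
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

AudioComponent component = AudioComponentFindNext(NULL, &desc);
AudioUnit audioUnit;
AudioComponentInstanceNew(component, &audioUnit);

UInt32 one = 1;   // enable capture on the input bus (bus 1)
AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Input, 1, &one, sizeof(one));

AURenderCallbackStruct cb;
cb.inputProc = handleInputBuffer;   // the callback shown below
cb.inputProcRefCon = this;          // handed back to the callback as inRefCon
AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_SetInputCallback,
                     kAudioUnitScope_Global, 1, &cb, sizeof(cb));

AudioUnitInitialize(audioUnit);
AudioOutputUnitStart(audioUnit);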
static OSStatus handleInputBuffer(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData)
{
    videocore::iOS::MicSource* mc = static_cast<videocore::iOS::MicSource*>(inRefCon);
    AudioBuffer buffer;
    buffer.mData = NULL;
    buffer.mDataByteSize = 0;
    buffer.mNumberChannels = 2;
    AudioBufferList buffers;
    buffers.mNumberBuffers = 1;
    buffers.mBuffers[0] = buffer;
    OSStatus status = AudioUnitRender(mc->audioUnit(),
                                      ioActionFlags,
                                      inTimeStamp,
                                      inBusNumber,
                                      inNumberFrames,
                                      &buffers);
    if(!status) {
        mc->inputCallback((uint8_t*)buffers.mBuffers[0].mData, buffers.mBuffers[0].mDataByteSize, inNumberFrames);
    }
    return status;
}

void MicSource::inputCallback(uint8_t *data, size_t data_size, int inNumberFrames)
{
    auto output = m_output.lock();
    if(output) {
        videocore::AudioBufferMetadata md (0.);
        md.setData(m_sampleRate,
                   16,
                   m_channelCount,
                   kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
                   m_channelCount * 2,
                   inNumberFrames,
                   false,
                   false,
                   shared_from_this());
        output->pushBuffer(data, data_size, md);
    }
}
After AudioMixer receives the data, it resamples the audio and places it in a queue of data awaiting encoding; the queue is a linked-list structure. Once AudioMixer's start() has been called, audio data is taken from the queue and passed to the next node (AACEncode).
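The resampling and mixing logic in AudioMixer is fairly involved. As an illustration of the resampling step only, a naive linear-interpolation resampler for 16-bit mono PCM could look like this (a sketch of the idea, not the AudioMixer implementation):

#include <cstdint>
#include <vector>

// Naive linear-interpolation resampler for 16-bit mono PCM, for illustration only.
std::vector<int16_t> resample(const std::vector<int16_t>& in, double inRate, double outRate)
{
    const double ratio = inRate / outRate;
    const size_t outCount = static_cast<size_t>(in.size() / ratio);
    std::vector<int16_t> out(outCount);
    for(size_t i = 0; i < outCount; ++i) {
        const double pos  = i * ratio;                 // position in the input signal
        const size_t idx  = static_cast<size_t>(pos);
        const double frac = pos - idx;
        const int16_t a = in[idx];
        const int16_t b = (idx + 1 < in.size()) ? in[idx + 1] : a;
        out[i] = static_cast<int16_t>(a + (b - a) * frac);  // interpolate between neighbours
    }
    return out;
}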
Video encoding
After receiving data from the previous node, H264EncodeApple compresses it with H.264:
void H264EncodeApple::pushBuffer(const uint8_t *const data, size_t size, videocore::IMetadata &metadata)
{
#if VERSION_OK
    if(m_compressionSession) {
        m_encodeMutex.lock();
        VTCompressionSessionRef session = (VTCompressionSessionRef)m_compressionSession;
        CMTime pts = CMTimeMake(metadata.timestampDelta + m_ctsOffset, 1000.); // timestamp is in ms.
        CMTime dur = CMTimeMake(1, m_fps);
        VTEncodeInfoFlags flags;
        CFMutableDictionaryRef frameProps = NULL;
        if(m_forceKeyframe) {
            s_forcedKeyframePTS = pts.value;
            frameProps = CFDictionaryCreateMutable(kCFAllocatorDefault, 1, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
            CFDictionaryAddValue(frameProps, kVTEncodeFrameOptionKey_ForceKeyFrame, kCFBooleanTrue);
        }
        VTCompressionSessionEncodeFrame(session, (CVPixelBufferRef)data, pts, dur, frameProps, NULL, &flags);
        if(m_forceKeyframe) {
            CFRelease(frameProps);
            m_forceKeyframe = false;
        }
        m_encodeMutex.unlock();
    }
#endif
}
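The frames above are handed to a VTCompressionSession that has to be created and configured before encoding starts (VideoCore does this when the encoder is set up). A minimal sketch of such a setup, with illustrative resolution, bitrate and keyframe-interval values rather than VideoCore's defaults:

// Minimal VideoToolbox compression session setup; values are illustrative.
VTCompressionSessionRef session = NULL;
OSStatus err = VTCompressionSessionCreate(kCFAllocatorDefault,
                                          1280, 720,                 // output dimensions
                                          kCMVideoCodecType_H264,
                                          NULL, NULL, NULL,
                                          vtCallback,                // the callback shown below
                                          this,                      // outputCallbackRefCon
                                          &session);
if(err == noErr) {
    VTSessionSetProperty(session, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);

    int32_t bitrate = 1000000;  // 1 Mbit/s
    CFNumberRef br = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitrate);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_AverageBitRate, br);
    CFRelease(br);

    int32_t keyframeInterval = 60;  // one keyframe every 60 frames
    CFNumberRef kf = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &keyframeInterval);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_MaxKeyFrameInterval, kf);
    CFRelease(kf);

    VTCompressionSessionPrepareToEncodeFrames(session);
}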
Hardware encoding provided by iOS (VideoToolbox) is used here. In the encoding callback, the encoded data is passed to the next node:
void vtCallback(void *outputCallbackRefCon,
                void *sourceFrameRefCon,
                OSStatus status,
                VTEncodeInfoFlags infoFlags,
                CMSampleBufferRef sampleBuffer )
{
    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    CMTime dts = CMSampleBufferGetDecodeTimeStamp(sampleBuffer);
    //printf("status: %d\n", (int) status);
    bool isKeyframe = false;
    if(attachments != NULL) {
        CFDictionaryRef attachment;
        CFBooleanRef dependsOnOthers;
        attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyframe = (dependsOnOthers == kCFBooleanFalse);
    }
    if(isKeyframe) {
        // Send the SPS and PPS.
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t spsSize, ppsSize;
        size_t parmCount;
        const uint8_t* sps, *pps;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, nullptr);
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, nullptr);
        std::unique_ptr<uint8_t[]> sps_buf (new uint8_t[spsSize + 4]);
        std::unique_ptr<uint8_t[]> pps_buf (new uint8_t[ppsSize + 4]);
        memcpy(&sps_buf[4], sps, spsSize);
        spsSize += 4;
        memcpy(&sps_buf[0], &spsSize, 4);
        memcpy(&pps_buf[4], pps, ppsSize);
        ppsSize += 4;
        memcpy(&pps_buf[0], &ppsSize, 4);
        ((H264EncodeApple*)outputCallbackRefCon)->compressionSessionOutput((uint8_t*)sps_buf.get(), spsSize, pts.value, dts.value);
        ((H264EncodeApple*)outputCallbackRefCon)->compressionSessionOutput((uint8_t*)pps_buf.get(), ppsSize, pts.value, dts.value);
    }
    char* bufferData;
    size_t size;
    CMBlockBufferGetDataPointer(block, 0, NULL, &size, &bufferData);
    ((H264EncodeApple*)outputCallbackRefCon)->compressionSessionOutput((uint8_t*)bufferData, size, pts.value, dts.value);
}

void H264EncodeApple::compressionSessionOutput(const uint8_t *data, size_t size, uint64_t pts, uint64_t dts)
{
#if VERSION_OK
    auto l = m_output.lock();
    if(l && data && size > 0) {
        videocore::VideoBufferMetadata md(pts, dts);
        l->pushBuffer(data, size, md);
    }
#endif
}
Audio encoding
After AACEncode receives the audio data it encodes it; VideoCore uses the iOS AudioToolbox framework for the AAC transcoding.
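The converter itself is created from an input PCM description and an output AAC description before any data arrives (VideoCore also queries the maximum output packet size, m_outputPacketMaxSize, from the converter). A simplified sketch of that setup, with illustrative sample rate and channel count:

// Describe interleaved 16-bit PCM in and AAC-LC out; values are illustrative.
AudioStreamBasicDescription in = {0}, out = {0};
in.mSampleRate       = 44100;
in.mFormatID         = kAudioFormatLinearPCM;
in.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
in.mChannelsPerFrame = 2;
in.mBitsPerChannel   = 16;
in.mBytesPerFrame    = in.mChannelsPerFrame * (in.mBitsPerChannel / 8);
in.mFramesPerPacket  = 1;
in.mBytesPerPacket   = in.mBytesPerFrame;

out.mSampleRate       = 44100;
out.mFormatID         = kAudioFormatMPEG4AAC;   // AAC-LC
out.mChannelsPerFrame = 2;
out.mFramesPerPacket  = 1024;                   // one AAC packet covers 1024 PCM samples

AudioConverterRef converter = NULL;
OSStatus err = AudioConverterNew(&in, &out, &converter);

With the converter in place, AACEncode feeds PCM into it through the ioProc callback and the pushBuffer method shown below.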
OSStatus AACEncode::ioProc(AudioConverterRef audioConverter, UInt32 *ioNumDataPackets, AudioBufferList* ioData, AudioStreamPacketDescription** ioPacketDesc, void* inUserData )
{
    UserData* ud = static_cast<UserData*>(inUserData);
    UInt32 maxPackets = ud->size / ud->packetSize;
    *ioNumDataPackets = std::min(maxPackets, *ioNumDataPackets);
    ioData->mBuffers[0].mData = ud->data;
    ioData->mBuffers[0].mDataByteSize = ud->size;
    ioData->mBuffers[0].mNumberChannels = 1;
    return noErr;
}

void AACEncode::pushBuffer(const uint8_t* const data, size_t size, IMetadata& metadata)
{
    const size_t sampleCount = size / m_bytesPerSample;
    const size_t aac_packet_count = sampleCount / kSamplesPerFrame;
    const size_t required_bytes = aac_packet_count * m_outputPacketMaxSize;
    if(m_outputBuffer.total() < (required_bytes)) {
        m_outputBuffer.resize(required_bytes);
    }
    uint8_t* p = m_outputBuffer();
    uint8_t* p_out = (uint8_t*)data;
    for ( size_t i = 0 ; i < aac_packet_count ; ++i ) {
        UInt32 num_packets = 1;
        AudioBufferList l;
        l.mNumberBuffers = 1;
        l.mBuffers[0].mDataByteSize = m_outputPacketMaxSize * num_packets;
        l.mBuffers[0].mData = p;
        std::unique_ptr<UserData> ud(new UserData());
        ud->size = static_cast<int>(kSamplesPerFrame * m_bytesPerSample);
        ud->data = const_cast<uint8_t*>(p_out);
        ud->packetSize = static_cast<int>(m_bytesPerSample);
        AudioStreamPacketDescription output_packet_desc[num_packets];
        m_converterMutex.lock();
        AudioConverterFillComplexBuffer(m_audioConverter, AACEncode::ioProc, ud.get(), &num_packets, &l, output_packet_desc);
        m_converterMutex.unlock();
        p += output_packet_desc[0].mDataByteSize;
        p_out += kSamplesPerFrame * m_bytesPerSample;
    }
    const size_t totalBytes = p - m_outputBuffer();
    auto output = m_output.lock();
    if(output && totalBytes) {
        if(!m_sentConfig) {
            output->pushBuffer((const uint8_t*)m_asc, sizeof(m_asc), metadata);
            m_sentConfig = true;
        }
        output->pushBuffer(m_outputBuffer(), totalBytes, metadata);
    }
}
Once the audio data has been encoded, it is passed on to the next nodes (Split -> AACPacketizer).
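The m_asc sent in pushBuffer above is the two-byte AudioSpecificConfig that the AAC packetizer and players need before the first AAC frame: 5 bits of audio object type, 4 bits of sampling-frequency index and 4 bits of channel configuration. A sketch of how such a config can be derived (the helper name is hypothetical):

// Build a 2-byte AudioSpecificConfig for AAC-LC; the helper name is hypothetical.
void makeAudioSpecificConfig(uint8_t asc[2], uint8_t samplingIndex, uint8_t channelCount)
{
    const uint8_t audioObjectType = 2;                            // 2 = AAC-LC
    asc[0] = (audioObjectType << 3) | (samplingIndex >> 1);       // 5 bits type + top 3 bits of index
    asc[1] = ((samplingIndex & 0x1) << 7) | (channelCount << 3);  // low bit of index + 4 bits channels
}
// Example: 44.1 kHz has sampling-frequency index 4, so stereo AAC-LC yields { 0x12, 0x10 }.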
Streaming
The streaming logic is handled in the final node (RTMPSession); the key code is as follows:
void RTMPSession::pushBuffer(const uint8_t* const data, size_t size, IMetadata& metadata)
{
    if(m_ending) {
        return;
    }
    // make the lambda capture the data
    std::shared_ptr<Buffer> buf = std::make_shared<Buffer>(size);
    buf->put(const_cast<uint8_t*>(data), size);

    const RTMPMetadata_t inMetadata = static_cast<const RTMPMetadata_t&>(metadata);

    m_jobQueue.enqueue([=]() {
        if(!this->m_ending) {
            static int c_count = 0;
            c_count++;

            auto packetTime = std::chrono::steady_clock::now();
            std::vector<uint8_t> chunk;
            chunk.reserve(size + 64);
            size_t len = buf->size();
            size_t tosend = std::min(len, m_outChunkSize);
            uint8_t* p;
            buf->read(&p, buf->size());
            uint64_t ts = inMetadata.getData<kRTMPMetadataTimestamp>();
            const int streamId = inMetadata.getData<kRTMPMetadataMsgStreamId>();

#ifndef RTMP_CHUNK_TYPE_0_ONLY
            auto it = m_previousChunkData.find(streamId);
            if(it == m_previousChunkData.end()) {
#endif
                // Type 0.
                put_byte(chunk, (streamId & 0x1F));
                put_be24(chunk, static_cast<uint32_t>(ts));
                put_be24(chunk, inMetadata.getData<kRTMPMetadataMsgLength>());
                put_byte(chunk, inMetadata.getData<kRTMPMetadataMsgTypeId>());
                put_buff(chunk, (uint8_t*)&m_streamId, sizeof(int32_t)); // msg stream id is little-endian
#ifndef RTMP_CHUNK_TYPE_0_ONLY
            } else {
                // Type 1.
                put_byte(chunk, RTMP_CHUNK_TYPE_1 | (streamId & 0x1F));
                put_be24(chunk, static_cast<uint32_t>(ts - it->second)); // timestamp delta
                put_be24(chunk, inMetadata.getData<kRTMPMetadataMsgLength>());
                put_byte(chunk, inMetadata.getData<kRTMPMetadataMsgTypeId>());
            }
#endif
            m_previousChunkData[streamId] = ts;
            put_buff(chunk, p, tosend);
            len -= tosend;
            p += tosend;

            while(len > 0) {
                tosend = std::min(len, m_outChunkSize);
                p[-1] = RTMP_CHUNK_TYPE_3 | (streamId & 0x1F);
                put_buff(chunk, p - 1, tosend + 1);
                p += tosend;
                len -= tosend;
            }
            this->write(&chunk[0], chunk.size(), packetTime, inMetadata.getData<kRTMPMetadataIsKeyframe>());
        }
    });
}
The data is packed into RTMP chunks and then pushed to the server.
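The put_byte, put_be24 and put_buff calls above are small serialization helpers defined in VideoCore's RTMP code. Judging from how they are used, their behavior is roughly the following (a sketch, not the exact VideoCore implementation):

// Append one byte to the chunk buffer.
static void put_byte(std::vector<uint8_t>& buf, uint8_t val)
{
    buf.push_back(val);
}

// Append a 24-bit integer in big-endian byte order (used for timestamps and message lengths).
static void put_be24(std::vector<uint8_t>& buf, uint32_t val)
{
    buf.push_back((val >> 16) & 0xFF);
    buf.push_back((val >> 8) & 0xFF);
    buf.push_back(val & 0xFF);
}

// Append a raw byte range.
static void put_buff(std::vector<uint8_t>& buf, const uint8_t* src, size_t srcSize)
{
    buf.insert(buf.end(), src, src + srcSize);
}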