Dart VM FFI Vision

Transcoded by SimpRead; original at gist.github.com


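Background

The Dart FFI project (tracked as issue #34452) aims to provide a low-boilerplate, low-ceremony, low-overhead way of interoperating with native C/C++ code.

The motivation behind this project is twofold:

- One of the most common requests for Flutter is a low-overhead synchronous mechanism for interacting with native (C/C++) code (see issue #7053).
- We would like an alternative to the Dart VM C API that reflects what the Dart language looks like today and the environments in which it is used.

Currently Flutter supports interacting with platform-specific code written in Java (Kotlin) and Objective-C (Swift) through platform channels. This mechanism is based on asynchronous message passing and requires people to write glue code in both Dart and the respective platform language. It is a high-overhead solution, both in performance and in the amount of boilerplate the programmer has to write.

The Dart VM provides a C API, defined in the dart_api.h header, together with a mechanism for binding Dart code to native C/C++ code via native extensions. However, this mechanism is not integrated with Flutter and cannot be used out of the box.

While it would be possible to make the necessary changes to the Flutter engine and tooling so that developers could write native extensions based on the VM API (see the Native Flutter Extensions Prototype doc), we believe that the Dart VM C API is not the right way forward, for the following reasons.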

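- The C API is name based, e.g. Dart_Handle Dart_GetField(Dart_Handle container, Dart_Handle name);
  - this makes it AOT-unfriendly;
  - this makes it slow, because the results of name resolution are not cached.
- It is reflective: the signature of a native function is void (Dart_NativeArguments args), which allows it to accept any arguments and return any result, even though the function signature on the Dart side is usually much stricter and gives the compiler enough information to perform the necessary marshalling automatically. Unwrapping arguments and wrapping results takes multiple round trips across the API boundary and cannot be optimized by the Dart compilation toolchain. One of the core ideas behind FFI is:

  If a native function with a statically known signature is bound to a Dart function with a known signature, then marshalling of arguments and results based on statically known types is more efficient than reflective marshalling based on the Dart C API.
- It is verbose.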

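Based on these observations, we also expect that a more streamlined way of integrating with native code should benefit existing users of the Dart VM C API. For example, we expect that moving the Flutter engine from the C API to FFI should significantly reduce the overhead of crossing the boundary between Dart and native code.

Design sketch

Note on the type system

In general, we try to fit the FFI design into the existing Dart type system, so that things like code completion and static errors work as expected.

However, as the sections below show, this is not always possible, usually because the type system lacks features that would let us encode the necessary information in static types and enforce additional typing rules.

This means that the FFI implementation will likely need its own extensions to the Dart type system, and will have to enforce its rules as an additional Kernel transformation at the CFE level and separately at the analyzer level. Unfortunately, the incomplete unification of the Dart front ends means that this work has to be duplicated, just as it is for other language features.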

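Accessing native types from Dart

The first pillar of FFI is a way to access native memory from Dart. How this can be expressed in Dart code is constrained by Dart semantics:

- Dart types are reference types.
- The mapping between native types and Dart built-in types is often many-to-one. For example, the native int8_t and int32_t both correspond to the int type on the Dart side.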

Pointers and primitives

library dart.ffi;

/// Classes representing native width integers from the native side.
/// They are not constructible in the Dart code and serve purely as
/// markers in type signatures.
class _NativeType { }
class _NativeInteger extends _NativeType { }
class _NativeDouble extends _NativeType { }
class Int8 extends _NativeInteger { }
class Int16 extends _NativeInteger { }
class Int32 extends _NativeInteger { }
class Int64 extends _NativeInteger { }
class Uint8 extends _NativeInteger { }
class Uint16 extends _NativeInteger { }
class Uint32 extends _NativeInteger { }
class Uint64 extends _NativeInteger { }
class IntPtr extends _NativeInteger { }
class Float extends _NativeDouble { }
class Double extends _NativeDouble { }
class Void extends _NativeType {}

// Note: do we need to have Char type?
// Note: do we need to have ConstPointer type that only supports loads?

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> extends _NativeType {
  /// Cast Pointer<T> to Pointer<U>.
  Pointer<U> cast<U extends _NativeType>();

  /// Pointer arithmetic (takes element size into account).
  Pointer<T> elementAt(int index);

  /// Pointer arithmetic (byte offset).
  Pointer<T> offsetBy(int offsetInBytes);

  /// Store a value of Dart type R into this location.
  void store<R>(R value);
  
  /// Load a value of Dart type R from this location.
  R load<R>();

  /// Access to the raw pointer value and construction from raw value.
  int toInt();
  factory Pointer.fromInt(int ptr);
}

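Note that the store and load methods each take their own type parameter R, which denotes the Dart representation of the stored/loaded value. Unfortunately, the Dart type system does not let us express a mutual constraint between T and R (e.g. if T extends _NativeInteger then R should be int), so violations have to be reported by an "FFI typing pass".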

ffi.Pointer<ffi.Int32> ptr;
final i = ptr.load<int>();  // valid
final s = ptr.load<String>();  // compile time error 

Note that we will rely on the FFI typing pass to prohibit uses of Pointer<T> where T (or R) is not statically known.

// Compile time error: Pointer<T> has to be statically instantiated.
int load<T extends _NativeType>(Pointer<T> p) => p.load();

// Compile time error: R has to be statically instantiated.
R load<R>(Pointer<Int32> p) => p.load();

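This restriction exists to ensure that the backend can generate the simplest, monomorphic code for pointer loads.

Note: load<R> and store<R> could become extension methods if Dart supported them; then you could write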

on Pointer<T extends _NativeInteger> {
  int load();
  void store(int value);
}

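Alternatives considered to `load<R>`/`store<R>`

One question here is what kinds of operations can be performed on a Pointer<T>. An obvious idea is to allow loading and storing values of type T through the pointer.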

abstract class Pointer<T extends _NativeType> {
  void store(T value);
  T load();
}

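But this does not make sense, because it would mean that Pointer<Int32> dereferences to Int32, while we want it to dereference to int, a type Dart programmers know how to use. This is similar to how Int32List.[] returns int rather than Int32.

Unfortunately, the Dart type system does not let us write something like this: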

abstract class Pointer<T extends _NativeType> {
  void store(Representation(T) value);
  Representation(T) load();
}

where Representation(T) is:

- int when T extends _NativeInteger;
- double when T extends _NativeDouble;
- T when T extends Pointer.

One possible approach is to introduce Pointer subclasses for these different cases.

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> {
  /// Cast Pointer<T> to any other pointer type.
  P cast<P extends Pointer>();
}

abstract class IntPointer<T extends _NativeInteger> extends Pointer<T> {
  void store(int value);
  int load();
}

abstract class DoublePointer<T extends _NativeDouble> extends Pointer<T> {
  void store(double value);
  double load(); 
}

abstract class PointerPointer<T extends Pointer> extends Pointer<T> {
  void store(T value);
  T load();
}

Another possibility is to say that any Pointer<T> can be converted to a typed array.

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> {
  U asList<U extends TypedData>();
}

Then you could write code like this:

import 'dart:ffi' as ffi;
import 'dart:typed_data';

ffi.Pointer<ffi.Void> ptr;
// Equivalent of ptr.cast<IntPointer<Int32>>().store(10);
ptr.asList<Int32List>()[0] = 10;

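Note: the Dart type system does not let us express a constraint on U ensuring that U is a concrete subclass of TypedData rather than, say, TypedData itself.

The subclass-based approach also leads to very verbose types such as PointerPointer<IntPointer<Int32>>.

Allocating and freeing memory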

library dart.ffi;

/// Allocate [count] elements of type [T] and return a pointer
/// to the newly allocated memory.
Pointer<T> allocate<T extends _NativeType>({int count = 1});

/// Free memory pointed to by [p].
void free<P extends Pointer>(P p);

/// Return a pointer object that has a finalizer attached to it. When this
/// pointer object is collected by GC the given finalizer is invoked.
///
/// Note: the pointer object passed to the finalizer is not the same as 
/// the pointer object that is returned from [finalizable] - it points
/// to the same memory region but has different identity. 
Pointer<T> finalizable<T extends _NativeType>(Pointer<T> p, void finalizer(Pointer<T> ptr));

Structures/Unions

In general, pointers alone are enough to work with structured data.

import 'dart:ffi' as ffi;

/// Same as
///
///     struct Point {
///       double x;
///       double y;
///       Point* next;
///     };
///
class Point {
  // Raw pointer to the first byte of the native struct.
  final ffi.Pointer<ffi.Uint8> _ptr;

  Point.fromPtr(ffi.Pointer<ffi.Void> ptr) : _ptr = ptr.cast<ffi.Uint8>();

  // Allocates room for two doubles followed by one pointer.
  Point(double x, double y, Point next) :
    _ptr = ffi.allocate<ffi.Uint8>(
       count: ffi.sizeOf<ffi.Double>() * 2 +
              ffi.sizeOf<ffi.Pointer<ffi.Void>>()) {
    this.x = x;
    this.y = y;
    this.next = next;
  }

  // Field x lives at byte offset 0.
  ffi.Pointer<ffi.Double> get _xPtr =>
    _ptr.offsetBy(0).cast<ffi.Double>();
  set x (double v) { _xPtr.store(v); }
  double get x => _xPtr.load();

  // Field y follows x.
  ffi.Pointer<ffi.Double> get _yPtr =>
    _ptr.offsetBy(ffi.sizeOf<ffi.Double>() * 1).cast<ffi.Double>();
  set y (double v) { _yPtr.store(v); }
  double get y => _yPtr.load();

  // Field next follows the two doubles and holds a pointer, not a double.
  ffi.Pointer<ffi.Pointer<ffi.Void>> get _nextPtr =>
    _ptr.offsetBy(ffi.sizeOf<ffi.Double>() * 2).cast<ffi.Pointer<ffi.Void>>();
  set next (Point v) { _nextPtr.store(v._ptr.cast<ffi.Void>()); }
  Point get next => Point.fromPtr(_nextPtr.load());
}

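However, this code is very verbose, so we would like to hide it behind a layer of syntactic sugar. The core idea is to describe the layout with normal field declarations, where each field has two types associated with it:

- the normal Dart type of the field specifies how the value is exposed to Dart code;
- an annotation specifies the native storage format of the field.

For example, a declaration like this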

import 'dart:ffi' as ffi;

@ffi.struct  // Specifies layout (either ffi.struct or ffi.union)
class Point extends ffi.Pointer<Point> {
  @ffi.Double()  // () are confusing :-(
  double x;
  
  @ffi.Double()
  double y;

  @ffi.Pointer()  // To distinguish from the case when one struct embeds
  Point next;     // another by value.
}

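can be transformed by the front end into something matching the more verbose declaration above.

Note: several questions remain to be answered here:

- How do we conveniently cast Pointer<Point> to Point?
- What constructors should Point have?
- …

Structure layout and portability

Structure layout is inherently non-portable across platforms. For example, struct stat, used by the POSIX file status API, has different layouts on Mac OS X and Linux.

Dart has no equivalent of a preprocessor, so specifying platform-dependent layouts requires some other mechanism.

One potential approach looks like this: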

@ffi.struct({
  'x64 && linux': { // Layout on 64-bit Linux
    'x': ffi.Field(ffi.Double, 0),
    'y': ffi.Field(ffi.Double, 8),
    'next': ffi.Field(ffi.Pointer, 16)
  },
  'arm && ios': {  // Layout on 32-bit iOS
    'x': ffi.Field(ffi.Float, 4),
    'y': ffi.Field(ffi.Float, 8),
    'next': ffi.Field(ffi.Pointer, 0)
  },
})
class Point extends ffi.Pointer<Point> {
  double x;
  double y;
  Point next;
}

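Function types

Before we dig into the details of how function pointers are represented in Dart, let us outline what is involved in calling native functions from Dart and vice versa.

Calling native functions from Dart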

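To call a native function from Dart we need to:

1. Convert outgoing arguments from their Dart representation to a native representation.
   Note: an important decision here is how much automatic argument marshalling we want to allow. For example, is String converted to uint8_t* automatically, or does the programmer have to perform the conversion explicitly when calling the function?
2. If the callee can re-enter Dart ("non-leaf"): record exit frame information so that the Dart GC can find it.
   Note: declaring a function to be a leaf (= it never enters Dart code) is an optimization, because it simplifies argument marshalling and the transition from Dart to native code, and also allows optimizations across such calls, since such a function cannot affect pure Dart objects. If a function is a leaf, then passing a String as a const uint8_t* argument can be as simple as passing a pointer to the body of the String (if the string is a one-byte string).
   Note: in general this also means that FFI cannot (easily) interoperate with native non-local control flow (longjmp or exceptions), where control is transferred from one native frame to another, bypassing the Dart frames sandwiched in between. (There are ways to interoperate with exceptions, but they are non-trivial and therefore out of scope for now.)
3. Arrange the outgoing arguments on the stack and in registers according to the callee's calling convention.
4. Invoke the callee.
5. When the callee returns, convert the result to its Dart representation and tear down the exit frame.
   Note: an important question here is how to represent structs returned by value. [The closest idea is to allocate them on the native heap and return a pointer with a finalizer attached instead of a value.]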

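Calling Dart functions from native code

Calling a Dart function from native code is not much different from the process described above; the steps are just somewhat reversed. There are only a few questions to answer:

- Do we allow calling closures and class methods, or do we restrict ourselves to static functions?
  - If we allow them, how are they represented in native code, and how is the receiver represented? (Note: so far we have only talked about passing native data back and forth. Passing Dart objects into native code requires a handle system so that the GC knows about them.)
- Do we expect the thread calling a Dart function to be attached to an isolate (e.g. via the Dart_EnterIsolate API call)? Do we want to guard against users misusing FFI and trying to invoke a function (e.g. a callback) on the wrong thread? Should FFI be structured so that this kind of error is visible and can be reported, or should we just crash?

Representing function pointers

Imagine that we want to convert this code to Dart FFI: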

typedef int32_t (*binary_t)(int32_t x, int32_t y); 
struct Ops {
  binary_t add;
  binary_t sub;
};

// Invoke by pointer
int32_t invoke(binary_t f, int32_t x, int32_t y) {
  return f(x, y);
}

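We could follow our design for fields: use a combination of two types, one describing the native nature of the function pointer and the other describing how it is used from Dart. For example, we could extend the Pointer class with a method that coerces it to a Dart function, and introduce a NativeFunction class that represents the type of a native function.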

library dart.ffi;

abstract class Pointer<T extends _NativeType> {
  // Should only be valid if T is a function type. Creates a function that
  // will marshall all incoming parameters, perform an invocation via
  // this pointer and then unmarshall the result. 
  U asFunction<U extends Function>();
} 

class NativeFunction<T extends Function> extends _NativeType {
}

This can be used like this:

import 'dart:ffi' as ffi;

typedef ffi.Int32 NativeBinaryOp(ffi.Int32, ffi.Int32);
typedef int BinaryOp(int, int);

@ffi.struct 
class Ops extends ffi.Pointer<Ops> {
  // Front-end ensures that type of the annotation is 
  @ffi.NativeFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32)>()
  BinaryOp add;

  @ffi.NativeFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32)>()
  BinaryOp sub;
}

// Invoke by pointer. Note: have to write ffi.Pointer<NativeFunction<...>>
// because Pointer constraints T to be a subtype of _NativeType.
int invoke(ffi.Pointer<ffi.NativeFunction<NativeBinaryOp>> op, int x, int y) {
  return op.asFunction<BinaryOp>()(x, y);
}

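Note: because we want the code to be AOT-compilable, we will specify that an invocation of Pointer<F>.asFunction<G>() depends only on the static values of F and G, not on the reified type of the receiver; otherwise we could not precompile all the necessary marshalling stubs. (A language feature for specifying generic invariance/exactness would be useful here.)

This looks relatively clean, but unfortunately it does not capture some necessary information:

- the calling convention;
- whether the function is a leaf.

Unfortunately, it is not entirely clear what the best way of encoding this information into the pointer type is. One possible approach is something like this: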

class _CallingConvention {}
class Cdecl extends _CallingConvention {}
class StdCall extends _CallingConvention {}

class _Leafness {}
class Leaf extends _Leafness {}
class NotLeaf extends _Leafness {}

class NativeFunction<T extends Function, 
                     CC extends _CallingConvention, 
                     L extends _Leafness> extends _NativeType {
}

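But this may be too verbose (especially since Dart does not support default values for type parameters).

Conversion between built-in types and native types

TODO(vegorov) describe helpers that can be used to convert between pointers and strings, pointers and arrays, etc. Note: external typed data and external strings can be used for efficient conversion.

Converting Dart functions to function pointers

What if a native function requires you to pass in a callback?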

typedef intptr_t (*callback_t)(void* baton, void* something);
void with_something(callback_t cb, void* baton);

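If we want to call this from Dart, how do we pass a function?

For simplicity, we should initially only allow passing static methods. This is very simple to implement, because a static method can simply have a redirecting trampoline.

For APIs that allow a baton to be associated with a callback, users can pass closures via handmade persistent handles, along these lines: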

typedef int Callback(ffi.Pointer<ffi.Void> something);

int _id = 0;
final _i2cb = <int, Callback>{};
final _cb2i = <Callback, int>{};

int _trampoline(ffi.Pointer<ffi.Void> baton, ffi.Pointer<ffi.Void> something) {
  // The baton carries the integer handle of the registered closure.
  return _i2cb[baton.toInt()](something);
}

ffi.Pointer<ffi.Void> _toHandle(Callback cb) {
  return ffi.Pointer<ffi.Void>.fromInt(_cb2i.putIfAbsent(cb, () {
    _i2cb[_id] = cb;
    return _id++;
  }));
}

void withSomething(Callback cb) {
  with_something(_trampoline, _toHandle(cb));
}

Note that this leaks memory, so it is really only suitable for singletons or for APIs that support unregistration.

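For APIs without a baton there is still a way to pass closures as function pointers: give each distinct closure its own closure-specific trampoline. However, this only works when the number of closures passed to the other side is small (because AOT has to pregenerate a fixed number of trampolines), and it is only really practical with APIs that support both registration and unregistration.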

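Binding native code to Dart methods

The previous sections already cover the possibility of calling native code from Dart through a function pointer. So if the dart:ffi library provides primitives like dlopen/dlsym, that is already enough to cross the boundary in this direction.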

library dart.ffi;

class DynamicLibrary {
  // Equivalent of dlopen
  factory DynamicLibrary.open(String name);

  // Equivalent of dlsym
  Pointer<SymbolType> lookup<SymbolType extends _NativeType>(String symbolName);

  // Helper that combines lookup and cast to a Dart function.
  // Note: user code would not be permitted to be generic like this.
  // However FFI's own code can.
  // Note: ignoring leafness and calling convention for brevity.
  F lookupFunction<SymbolType extends Function, F extends Function>(String symbolName) {
    // Pointer requires a _NativeType, so the symbol type is wrapped in NativeFunction.
    return lookup<NativeFunction<SymbolType>>(symbolName)?.asFunction<F>();
  }
}

import 'dart:ffi' as ffi;

// Invoke int32_t add(int32_t, int32_t) from library libfoo.so
final lib = ffi.DynamicLibrary.open('libfoo.so');
final add = lib.lookupFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32), int Function(int, int)>('add');
print(add(1, 2));

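However, this style of code is unnecessarily verbose, so we should also provide a declarative way of binding Dart functions to native functions. For example: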

library dart.ffi;

/// An annotation that can be used to make FE/VM generate binding code
/// between an extern static method declaration and native code.
class Import<NativeType> {
  /// Native library that contains the target native method.
  /// Can be null - then the symbol is resolved globally.
  final String library;

  /// Symbol to bind to.
  final String symbol;

  /// Specifies whether the target function is a leaf, i.e. is expected
  /// never to call back into Dart code.
  final bool isLeaf;

  final callingConvention;

  const Import({
    this.library,
    this.symbol,
    this.isLeaf = true,
    this.callingConvention = Cdecl  // Note: Cdecl is a Type literal.
  });
}

import 'dart:ffi' as ffi;

@ffi.Import<ffi.Int32 Function(ffi.Int32, ffi.Int32)>(
  library: 'foo',  // Q: should mangle library name in platform specific way?
  symbol: 'add',
)
extern int nativeAdd(int a, int b);

@ffi.Import<ffi.Int32>(symbol: 'g_counter')
extern int globalCounter;

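For the opposite direction, making Dart functions callable from native code, the core idea is to introduce an annotation ffi.Export.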

library foo;

@ffi.Export<ffi.Int32 Function(ffi.Int32, ffi.Int32)>(symbol: 'add')
int add(int a, int b) => a + b;

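This annotation would instruct the VM to generate an externally callable trampoline with the corresponding native signature int32_t (int32_t, int32_t).

Then, in native code, the developer can do: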

typedef int32_t (*add_t)(int32_t, int32_t);
add_t f = Dart_LookupFFIExport("foo", "add");
f(1, 2);

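Note that this can be taken further: we could have a tool that generates a binding module from the annotations, containing code like the following.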

#if defined(DART_AOT_USING_DLL)
// AOT compiler would generate a symbol that can be hooked up by 
// the normal dynamic linkage process.
extern "C" int32_t dart_foo_add(int32_t x, int32_t y);
#else
// In JIT or blob based AOT we have to lookup dynamically.
int32_t dart_foo_add(int32_t x, int32_t y) {
  static int32_t (*f) (int32_t, int32_t) = Dart_LookupFFIExport("foo", "add");
  return f(x, y);
}
#endif