This blog post is excerpted, with some translation, from a chapter of the book Programming 3D Applications with HTML5 and WebGL, introducing the 3D content pipeline.
It also covers a topic I have been following closely: 3D content formats.
Original text
Excerpts from: Programming 3D Applications with HTML5 and WebGL
The 3D Content Pipeline
In the early days of the Web, if you knew how to write markup, you were a content creator. There was no Dreamweaver WYSIWYG editing; no Photoshop tool for slicing images. The task was left, largely, to programmers—and the Web looked like it. Eventually, the makers of professional authoring software developed tools for creating web-ready content. Once the tools were in place, artists and designers assumed the content responsibilities, and the Internet transformed into a consumer-grade experience.
WebGL development is going through an evolution similar to those early days of the Web. For the first few years of the technology’s existence, content was created by hand by programmers typing into text editors, or cobbled from whatever 3D format they could find a converter for. If a converter didn’t exist, you would write one to get the project done.
Fortunately, the situation is changing rapidly. Three.js and other WebGL libraries are getting better at importing content created by professional tools. The industry is also pulling together to create new 3D file format standards designed specifically for web use. The content creation landscape is still a bit rocky, but at least we have moved beyond the “stone tools” stage of a few years ago into more of a Bronze Age of 3D development.
This chapter covers the 3D content pipeline for web development. First, we will look at the overall content creation process. You may find this useful if you are new to 3D authoring. Then, we survey popular modeling and animation tools being used in today’s WebGL projects, and dig into the details of the 3D file formats that are best suited to web deployment. Finally, we will learn how to load those files into applications using Three.js utilities, in preparation for projects to come in following chapters.
The 3D Creation Process
3D content creation involves a set of highly specialized disciplines. Professional careers in 3D require extensive training and a deep understanding of complex authoring tools and workflows. Often, one 3D artist does everything, including modeling, texture mapping, and animating. But sometimes, especially on bigger projects, people specialize.
In many ways, 3D content creation is similar to making 2D art with Photoshop or Illustrator. But 3D authoring is also different from 2D art creation in a few fundamental respects. Even if you consider yourself a technical person, if you are planning on developing a 3D project, it’s good to know what it takes to make the content that goes into it. With that in mind, let’s take a look at the basic steps involved in the 3D creation process.
Modeling
3D model creation typically starts with a sketch by the artist. Before long, a modeling package is used to turn that sketch into a digital representation in 3D. Models are usually created as 3D polygonal meshes, drawn first as wireframes and then shaded with materials. This activity is known as 3D modeling, and the person who does it for a living is called a modeler. Figure 8-1 depicts a basic model of a teapot, created with Autodesk 3ds Max. The model is seen from four different views: top, left, front, and perspective.
Figure 8-1. 3D modeling in 3ds Max with top, front, left, and perspective views
(image ©Autodesk, from the main Wikipedia entry on 3ds Max)
Texture Mapping
Texture mapping, also known as UV mapping, is the process of creating 2D art to wrap onto the surface of a 3D object. Modelers often do their own texture mapping, though in larger projects the responsibilities may be divided, and specialized texture artists do the texturing. Texture mapping is usually done with assistance from a visual tool built directly into the modeling package. The tool allows the artist to associate vertices of the mesh with positions on the 2D texture map while providing visual feedback. Figure 8-2 depicts texture mapping, where we see the map on the left; the combined view is on the bottom right and overlays vertex positions with the image data; and the resulting preview is on the top right. Note the somewhat counterintuitive layout of the image data on the left. Only half the face is shown. This is because, in the case of this texture map, the left and right sides of the face are mirror images. This strategy allows the artist to pack more data into less space and/or use other parts of the image for additional detail.
Figure 8-2. Texture mapping: a 2D image is wrapped and reflected onto the surface of a 3D object
(image courtesy Simon Wottge)
Animating
The process of creating 3D animations ranges from easy to extremely difficult, depending on the task. Key frame animating tends to be simple, at least in concept, though the interfaces can get tricky to use and visually cluttered. A key frame editor, like the one depicted in Figure 8-3 from Autodesk Maya, contains a set of timeline controls (highlighted in the red rectangle near the bottom of the Maya window) that allow the artist, also known as the animator, to move or otherwise change the object in the view, and then identify and click on positions in the timeline to define the key frames. Key frames can be used to change translation, rotation, scale, and even light and material attributes. When an animator wants to key frame more than one attribute, he or she adds another track to the animation timeline. The animator lays out tracks in the interface by stacking them, which is what can lead to the visual clutter.
Animating characters with skinning is much more involved. Before the character can be animated, a set of bones, or rig, must be created. The rig determines various aspects of how the skin moves in response to movements of the bones. Rigging, or the process of creating the rig, is a very specialized skill. Often, different artists do the character animation and rigging.
Figure 8-3. Maya’s animation timeline tool, with controls for key frames animating translation, rotation, scale, and other attributes (image courtesy UCBUGG Open Course Ware)
Technical Art
We may not think of programming as a content creation activity, but in 3D development it often is. Complex special effects, such as certain shaders and post-processing techniques, can require the skills of an experienced programmer. In game and animation shops, this job falls to a technical artist (TA) or technical director (TD). There is no formal definition of these positions, nor a strict difference between the two, though as the name implies, the TD is usually a more senior and experienced person. TDs write scripts, rig characters, write converter programs to get art from one format into another, implement special effects, develop shaders—in other words, all the stuff that is too technical for a visual artist to handle. It is a highly valued set of skills, and to many producers, good TDs are worth their weight in gold.
Given that they program for a living, TDs’ tool of choice is usually a text editor. However, there are now some interesting visual development tools for creating shaders and special effects. One example is ShaderFusion, a recently released visual tool for use with the Unity game engine. ShaderFusion allows the developer to develop shaders by defining data flows between one object’s outputs (such as time or position) and another object’s inputs (e.g., color and refraction). The interface is depicted in Figure 8-4.
Figure 8-4. ShaderFusion, a visual shader editor for the Unity3D engine
3D File Formats
There have been many 3D file formats developed over the years—so many that an exhaustive list would not be possible here. Some 3D formats have been designed to store files for a single authoring package; others have been designed to exchange data between packages. Some formats are proprietary—that is, completely controlled by a single company or software vendor—whereas others are open standards defined by an industry group. Some 3D file formats are entirely text-based and, therefore, human-readable, while others use a binary representation to save space.
3D file formats fall into three general categories: model formats, used to represent single objects; animation formats for animating key frames and characters; and full-featured formats that support entire scenes, including multiple models, transform hierarchy, cameras, lights, and animations. We will look at each of these kinds of formats, with a special emphasis on the ones that are best suited for web-based applications.
Model Formats
Single-model 3D formats are used extensively for interchange between different packages. Most modeling packages, for example, can import and export the OBJ format (see next section). Because they tend to have a simple syntax and only a few features, it is easy to implement support for them, and their use is prevalent. They do, however, tend to be quite limited in the features they support.
Wavefront OBJ
The OBJ file format, originally developed by Wavefront Technologies, is one of the oldest and best-supported single-model formats in the industry. It is extremely simple, supporting only geometry (with the associated vertices, normals, and texture coordinates). Wavefront introduced the companion MTL (Material Template Library) format for applying materials to geometry.
Example 8-1 illustrates the basics of an OBJ file, an excerpt from the classic “ball chair” model that we will be loading with Three.js later in the chapter (and depicted in Figure 8-12 later in the chapter). The OBJ file is packaged with the code examples in the file models/ball_chair/ball_chair.obj. Let’s have a look at the syntax. The # character is used as a comment delimiter. The file consists of a series of declarations. The first declaration is a reference to the material library stored in the associated MTL file. After that, several geometry objects are defined. This excerpt shows a partial listing of the definition for the object shell, the outer shell of the ball chair. We define the shell by specifying vertex position, normal, and texture coordinate data, one entry per line, followed by face data, also one per line. Each vertex of the face is specified by a triple in the form v/vt/vn, where v is the index of the previously supplied vertex position, vt the index of the texture coordinate, and vn the index of the vertex normal.
Example 8-1. A model in Wavefront OBJ format
# 3ds Max Wavefront OBJ Exporter v0.97b - (c)2007 guruware
# File Created: 20.08.2013 13:29:52
mtllib ball_chair.mtl
#
# object shell
#
v -15.693047 49.273174 -15.297686
v -8.895294 50.974277 -18.244076
v -0.243294 51.662109 -19.435429
... more vertex positions here
vn -0.537169 0.350554 -0.767177
vn -0.462792 0.358374 -0.810797
vn -0.480322 0.274014 -0.833191
... more vertex normals here
vt 0.368635 0.102796 0.000000
vt 0.348531 0.101201 0.000000
vt 0.349342 0.122852 0.000000
... more texture coordinates here
g shell
usemtl shell
s 1
f 313/1/1 600/2/2 58/3/3 597/4/4
f 598/5/5 313/1/1 597/4/4 109/6/6
f 313/1/1 598/5/5 1/7/7 599/8/8
f 600/2/2 313/1/1 599/8/8 106/9/9
f 314/10/10 603/11/11 58/3/3 600/2/2
... more face definitions here
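To make the v/vt/vn indexing concrete, here is a minimal sketch of an OBJ reader in JavaScript. It is an illustration, not the Three.js loader: it handles only the statements shown above and ignores groups, smoothing, and materials.

```javascript
// Minimal OBJ reader covering only the v, vn, vt, and f statements
// shown above. OBJ indices are 1-based, so we subtract 1 to get
// JavaScript array indices.
function parseOBJ(text) {
  const positions = [], normals = [], uvs = [], faces = [];
  for (const line of text.split('\n')) {
    const parts = line.trim().split(/\s+/);
    switch (parts[0]) {
      case 'v':  positions.push(parts.slice(1).map(Number)); break;
      case 'vn': normals.push(parts.slice(1).map(Number)); break;
      case 'vt': uvs.push(parts.slice(1).map(Number)); break;
      case 'f':
        // Each face corner is a "v/vt/vn" triple of indices
        faces.push(parts.slice(1).map(corner => {
          const [v, vt, vn] = corner.split('/').map(Number);
          return { v: v - 1, vt: vt - 1, vn: vn - 1 };
        }));
        break;
    }
  }
  return { positions, normals, uvs, faces };
}
```

Feeding it the shell excerpt yields position, normal, and texture coordinate arrays plus faces holding zero-based index triples, ready to be expanded into the flat vertex buffers WebGL expects.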
The material definitions that accompany the ball chair are in the MTL file models/ball_chair/ball_chair.mtl. The syntax is very simple; see Example 8-2. A material is declared with the newmtl statement, which contains a handful of parameters used to Phong shade the object: specular colors and coefficients (Ks, Ns, and Ni keywords), diffuse color (Kd), ambient color (Ka), emissive color (Ke), and texture maps (map_Ka and map_Kd). The texture map model for MTL has evolved over the years to include bump maps, displacement maps, environment maps, and other types of textures. In this example, only the diffuse and ambient texture maps are defined for the shell material.
Example 8-2. Material definitions for Wavefront OBJ format
newmtl shell
Ns 77.000000
Ni 1.500000
Tf 1.000000 1.000000 1.000000
illum 2
Ka 0.000000 0.000000 0.000000
Kd 0.588000 0.588000 0.588000
Ks 0.720000 0.720000 0.720000
Ke 0.000000 0.000000 0.000000
map_Ka maps\shell_color.jpg
map_Kd maps\shell_color.jpg
...
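A companion sketch reads these MTL declarations into a table of materials keyed by name (again an illustration, not Three.js's loader; only the keywords from the example are handled):

```javascript
// Reads MTL declarations into a dictionary of materials keyed by
// the name given in each newmtl statement.
function parseMTL(text) {
  const materials = {};
  let current = null;
  for (const line of text.split('\n')) {
    const parts = line.trim().split(/\s+/);
    const key = parts[0];
    if (key === 'newmtl') {
      current = materials[parts[1]] = {};
    } else if (!current) {
      continue;                                    // skip header lines
    } else if (['Ka', 'Kd', 'Ks', 'Ke', 'Tf'].includes(key)) {
      current[key] = parts.slice(1).map(Number);   // RGB triple
    } else if (['Ns', 'Ni', 'illum'].includes(key)) {
      current[key] = Number(parts[1]);             // scalar parameter
    } else if (key.startsWith('map_')) {
      current[key] = parts.slice(1).join(' ');     // texture map path
    }
  }
  return materials;
}
```

From here a loader would map Kd to a diffuse color and map_Kd to a texture when constructing, say, a Phong-shaded material.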
STL
Another simple, text-based, single model format is STL (for StereoLithography), developed by 3D Systems for rapid prototyping, manufacturing, and 3D printing. STL files are even simpler than OBJ. The format supports only vertex geometry—no normals, texture coordinates, or materials. Example 8-3 shows a fragment from one of the Three.js example STL files (examples/models/stl/pr2_head_pan.stl). To see the file in action, open
the Three.js example file examples/webgl_loader_stl.html. STL is an excellent candidate 3D format for building online 3D printing applications in WebGL, because the files can potentially be sent directly to 3D printing hardware. In addition, it loads easily and renders quickly.
Example 8-3. The STL file format
solid MYSOLID created by IVCON, original data in binary/pr2_head_pan.stl
facet normal -0.761249 0.041314 -0.647143
outer loop
vertex -0.075633 -0.095256 -0.057711
vertex -0.078756 -0.079398 -0.053025
vertex -0.074338 -0.088143 -0.058780
endloop
endfacet
...
endsolid MYSOLID
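The facet/vertex structure makes ASCII STL trivial to parse. Here is a sketch that collects each facet as a normal plus three vertices; note that the binary STL variant, common in 3D printing, would be read with a DataView instead:

```javascript
// Collects each facet of an ASCII STL file as { normal, vertices }.
// Binary STL (an 80-byte header followed by packed 50-byte facet
// records) would be decoded with a DataView rather than split lines.
function parseSTL(text) {
  const facets = [];
  let facet = null;
  for (const line of text.split('\n')) {
    const parts = line.trim().split(/\s+/);
    if (parts[0] === 'facet' && parts[1] === 'normal') {
      facet = { normal: parts.slice(2).map(Number), vertices: [] };
    } else if (parts[0] === 'vertex' && facet) {
      facet.vertices.push(parts.slice(1).map(Number));
    } else if (parts[0] === 'endfacet' && facet) {
      facets.push(facet);
      facet = null;
    }
  }
  return facets;
}
```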
STL is such an easy and popular format that GitHub has actually added STL viewing directly into its interface. The viewer is built in WebGL, using our old friend Three.js.
For technical details on the STL format, visit the Wikipedia page.
Animation Formats
The formats described in the previous section represent static model data only. But much of the content in a 3D application is moving around on the screen (i.e., animated). A few specialty formats have evolved to deal with representing animated models. These include the text-based—and therefore web-friendly—formats MD2, MD5, and BVH.
id Software animation formats: MD2 and MD5
A couple of 3D formats that you will see crop up in web use from time to time are the animation formats for id Software’s popular Doom and Quake franchises. The MD2 format and its successor, MD5, are formats that define character animation. While the formats are essentially controlled by id, their specifications were released long ago, and many tools have been written to import them.
The MD2 format, created for Quake II, is a binary file format. It supports vertex-based character animation only via morph targets. MD5 (not to be confused with the Message Digest algorithm, a cryptographic hash function used widely on the Web) was developed for Doom 3 and introduced skinned animation and a text-based, human-readable format.
Excellent documentation on the MD2 and MD5 specifications can be found online.
To use these formats in WebGL applications, we could write a loader that reads them directly, or if using a library like Three.js, we can use a converter. When an MD2 file is converted to JSON, the format looks something like the example from Chapter 5, depicted in Figure 5-11. As a refresher, run the Three.js example located at examples/webgl_morphtargets_md2_control.htm, and have a look at the source code. There is a lot going on to load and interpret MD2 data.
Three.js does not come with an MD5 loader as part of the example set. However, there is a wonderful online converter from MD5 to Three.js JSON that was written by Klas (OutsideOfSociety) of the Swedish web agency North Kingdom (developers of Find Your Way to OZ). To see already-converted models in action, go to Klas’s blog and open this link. You should see a fairly detailed model of a monster, with controls for starting the various gesture animations.
To run the converter on your own MD5 files, you can open this link, which lets you drag and drop MD5 files into the view window, and produces JSON code.
BVH: The motion capture data format
Motion capture, the process of recording the movement of objects, has become a very popular way to create content, especially animations of people. It is used extensively in film, animation, military, and sports applications. Motion capture is widely supported in open formats, including the Biovision Hierarchical Data format, or BVH. BVH was developed by the motion capture company Biovision to represent movements in the animation of human characters. BVH is a very popular, text-based format supported as an import and export format by many tools.
Developer Aki Miyazaki has created an early experiment to import BVH data into WebGL applications. His BVH Motion Creator, a web-based BVH preview tool written using Three.js, is depicted in Figure 8-11. BVH can be uploaded, and its animations previewed on the simple character.
Figure 8-11. BVH Motion Creator, a previewer for motion capture files in BVH format
Full-Featured Scene Formats
Over the years, a few standard formats have been developed by the industry to support representing the entire contents of a 3D scene, including multiple objects, transform hierarchy, lights, cameras, and animations—essentially anything created by an artist in a full-featured tool like 3ds Max, Maya, or Blender. In general, this is a much harder technical problem to solve, and few formats have survived to enjoy widespread use. This situation may change, however, with WebGL driving new requirements for reuse of content and interoperability between applications. In this section, we look at a few potential full-scene formats for use with WebGL.
VRML and X3D
Virtual Reality Modeling Language (VRML) is the original text-based standard for 3D on the Web, created in 1994 by a group that includes inventor and theorist Mark Pesce, members of the Silicon Graphics Open Inventor software team, and myself. VRML went through a couple of iterations in the 1990s, enjoying broad industry backing and the support of a nonprofit standards consortium. A successor featuring XML-based text representation was developed in the early 2000s, and renamed as X3D. While these standards are no longer widely deployed in web applications, they are still supported by most modeling tools as import and export formats.
VRML and X3D define full scenes, animation (key frames, morphs, and skinning), materials, lights, and even scripted, interactive objects with behaviors. Example 8-4 shows the X3D syntax for creating a scene with a red cube that will make a full rotation about the y-axis in two seconds when clicked. The geometry, behavior, and animations are all in this single XML file with an intuitive, human-readable syntax. To this day, there is no other open-standard 3D file format that can express all this functionality in such a simple, elegant syntax (if I do say so myself).
Example 8-4. X3D sample: A red cube that rotates when clicked
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE X3D PUBLIC "ISO//Web3D//DTD X3D 3.0//EN"
"http://www.web3d.org/specifications/x3d-3.0.dtd">
<X3D profile='Interactive' version='3.0'
xmlns:xsd='http://www.w3.org/2001/XMLSchema-instance'
xsd:noNamespaceSchemaLocation =
' http://www.web3d.org/specifications/x3d-3.0.xsd '>
<head>
... <!-- XML meta information for X3D file goes here -->
</head>
<!--
Index for DEF nodes: Animation, Clicker, TimeSource, XForm
-->
<Scene>
<!-- XForm ROUTE: [from Animation.value_changed to rotation ] -->
<Transform DEF='XForm'>
<Shape>
<Box/>
<Appearance>
<Material diffuseColor='1.0 0.0 0.0'/>
</Appearance>
</Shape>
<!-- Clicker ROUTE: [from touchTime to TimeSource.startTime ] -->
<TouchSensor DEF='Clicker' description='click to animate'/>
<!-- TimeSource ROUTEs:
[from Clicker.touchTime to startTime ] [from fraction_changed to
Animation.set_fraction ] -->
<TimeSensor DEF='TimeSource' cycleInterval='2.0'/>
<!-- Animation ROUTEs:
[from TimeSource.fraction_changed to set_fraction ]
[from value_changed to XForm.rotation ] -->
<OrientationInterpolator DEF='Animation' key='0.0 0.33 0.66 1.0'
keyValue='0.0 1.0 0.0 0.0 0.0 1.0 0.0 2.1 0.0 1.0 0.0 4.2 0.0 1.0 0.0 0.0'/>
</Transform>
<ROUTE fromNode='Clicker' fromField='touchTime' toNode='TimeSource'
toField='startTime'/>
<ROUTE fromNode='TimeSource' fromField='fraction_changed'
toNode='Animation' toField='set_fraction'/>
<ROUTE fromNode='Animation' fromField='value_changed' toNode='XForm'
toField='rotation'/>
</Scene>
</X3D>
The design of VRML embodies many key concepts of interactive 3D graphics, and for that reason, you might expect that it is well suited for WebGL use. However, the standard was developed in a pre-JavaScript, pre-DOM world, and also before the advent of many key hardware-accelerated graphics features in use today. At this point, in my humble opinion, VRML/X3D is too out of date to consider for practical use. At the same time, there are many ideas in there yet to be tapped for use in WebGL, so it is a great area for study and inspiration.
Over the years, a wealth of VRML and X3D content has been developed. The folks at the German-based Fraunhofer Institute continue to soldier down the X3D path and are now creating X3DOM, a library for viewing X3D content using WebGL, without the
need for a plugin. For more information on X3DOM, go to www.x3dom.org/. The VRML and X3D specifications may be found online.
COLLADA: The digital asset exchange format
In the mid-2000s, as VRML began showing signs of age, a group of companies, including Sony Computer Entertainment, Alias Systems Corporation, and Avid Technology, teamed up to develop a new format for exchanging 3D digital assets among games and interactive 3D applications. Rémi Arnaud and Mark C. Barnes of Sony led the design of the format, named COLLADA (for COLLAborative Design Activity). After the initial specification work and support from individual companies, development of the standard was turned over to the Khronos Group, the same nonprofit organization that develops WebGL, OpenGL, and other graphics hardware and software API standards.
COLLADA, like X3D, is a full-featured, XML-based format that can represent entire scenes, with a variety of geometry, material, animation, and lighting types. Unlike X3D, the goal of COLLADA is not to deliver an end-user experience complete with behaviors and runtime semantics. In fact, it is a stated nongoal of the technology. Rather, COLLADA is intended to preserve all of the information that could be exported from a 3D authoring tool so that it can be used downstream in another tool, or imported into a game engine or development environment before being deployed into the final application. The main idea was that, once COLLADA was widely accepted by the industry, the makers of various DCC tools would not have to worry about writing exporters to custom formats ever again; export to COLLADA, and, in theory, any package could import it.
Example 8-5 shows an excerpt from a COLLADA scene that we are going to load with Three.js later in this chapter. As we walk through it, there are several things to note about the structure of COLLADA files. First, all constructs are organized into libraries — collections of types such as images, shaders, and materials. These libraries usually come first in the XML definition, to be later referenced by constructs that need them (for example, images used in a material definition). Second, note that there are explicit declarations of what would normally be considered a built-in function, such as Blinn shading. COLLADA assumes nothing about shading and rendering models; it simply stores that data so that another tool can get the information and try to do something with it. Then, we see the vertex data for a mesh, expressed as a series of float_array elements. Finally, the mesh is assembled into a scene the user can see by referencing previously defined geometry and materials (using the instance_geometry, bind_material, and instance_material XML elements).
Example 8-5. COLLADA file structure, sample libraries, geometry, and scene
<?xml version="1.0"?>
<COLLADA xmlns="http://www.collada.org/2005/11/COLLADASchema"
version="1.4.1">
<asset>
<contributor>
<authoring_tool>CINEMA4D 12.043 COLLADA Exporter
</authoring_tool>
</contributor>
<created>2012-04-25T16:44:59Z</created>
<modified>2012-04-25T16:44:59Z</modified>
<unit meter="0.01" name="centimeter"/>
<up_axis>Y_UP</up_axis>
</asset>
<library_images>
<image id="ID5">
<init_from>tex/Buss.jpg</init_from>
</image>
... <!-- more image definitions here -->
</library_images>
<library_effects>
<effect id="ID2">
<profile_COMMON>
<technique sid="COMMON">
<blinn>
<diffuse>
<color>0.8 0.8 0.8 1</color>
</diffuse>
<specular>
<color>0.2 0.2 0.2 1</color>
</specular>
<shininess>
<float>0.5</float>
</shininess>
</blinn>
</technique>
</profile_COMMON>
</effect>
... <!-- more effect definitions here -->
<library_geometries>
<geometry id="ID56">
<mesh>
<source id="ID57">
<float_array id="ID58" count="22812">36.2471
9.43441 -6.14603 36.2471 11.6191 -6.14603 36.2471 9.43441 -9.04828
36.2471 11.6191 -9.04828 33.356 9.43441 -9.04828 33.356 11.6191
-9.04828 33.356 9.43441
... <!-- remainder of mesh definition here -->
...
<!-- define the scene as a hierarchy of nodes -->
<library_visual_scenes>
<visual_scene id="ID53">
<node id="ID55" name="Buss">
<translate sid="translate">5.08833 -0.496439
-0.240191</translate>
<rotate sid="rotateY">0 1 0 0</rotate>
<rotate sid="rotateX">1 0 0 0</rotate>
<rotate sid="rotateZ">0 0 1 0</rotate>
<scale sid="scale">1 1 1</scale>
<instance_geometry url="#ID56">
<bind_material>
<technique_common>
<instance_material
symbol="Material1" target="#ID3">
<bind_vertex_input
semantic="UVSET0"
input_semantic="TEXCOORD"
input_set="0"/>
</instance_material>
</technique_common>
</bind_material>
</instance_geometry>
</node>
... <!-- remainder of scene definition here -->
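To give a feel for what a COLLADA importer does with this XML, here is a sketch that pulls each float_array out and converts its whitespace-separated numbers into a WebGL-ready Float32Array. (A browser importer would walk the DOM via DOMParser; a regular expression stands in here to keep the sketch self-contained.)

```javascript
// Extracts every <float_array> element and converts its text content
// to a Float32Array. Every single number must be parsed from text,
// which is a large part of why COLLADA is expensive to load.
function floatArraysFromCollada(xml) {
  const arrays = {};
  const re = /<float_array\s+id="([^"]+)"[^>]*>([^<]*)<\/float_array>/g;
  let match;
  while ((match = re.exec(xml)) !== null) {
    arrays[match[1]] =
      Float32Array.from(match[2].trim().split(/\s+/), Number);
  }
  return arrays;
}
```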
After an initial period of high enthusiasm and broad vendor adoption, COLLADA support began to wane. Beginning around 2010, active development on exporter plugins for the popular DCC tools all but stopped. Recently, interest in COLLADA has picked up again, primarily due to the surge of support for WebGL—and the lack of a built-in file format for WebGL (more on this in a moment). There is a new open source project called OpenCOLLADA, with updated exporters for 3ds Max and Maya, from 2010 versions onward. It exports clean, standard-compliant COLLADA.
While improved COLLADA support is a boon to the 3D content pipeline, there is a problem. As we saw in the previous example, COLLADA is very verbose. The format was designed to preserve data, not to be fast to download and parse. That is why the Khronos Group has undertaken a new initiative that reimagines the best aspect of COLLADA—its full representation of rich, animated 3D scenes—into a new format designed for web delivery: glTF.
glTF: A new format for WebGL, OpenGL ES, and OpenGL applications
The rise in popularity of WebGL created a problem for web developers: the need to deliver full-scene content from 3D DCC tools into a running WebGL application. Single-mesh text formats such as OBJ are adequate for representing one object, but do not contain scene graph structure, lighting, cameras, and animation. COLLADA is fairly full-featured; however, as we saw in the previous section, it is verbose. In addition, it is represented in XML, requiring intensive CPU cycles to process into data structures suitable for rendering in WebGL. What was needed was a compact, web-ready format that requires minimal extra processing before rendering, something akin to a “JPEG for 3D.”
In the summer of 2012, Fabrice Robinet, an engineer at Motorola and chair of the Khronos COLLADA working group, began working on a 3D file format with the graphics features of COLLADA but with a more compact, WebGL-friendly representation. Originally, the project was dubbed COLLADA2JSON, the idea being that this would be a translation of the heftier XML syntax into lightweight JSON. Since then, the project has taken on a life of its own. Fabrice was joined by other contributors from the working group, including myself, COLLADA creator Remi Arnaud, and Patrick Cozzi, an engineer at defense software vendor AGI. Our mandate was expanded to broaden the scope beyond simple translation/optimization of COLLADA into a ground-up design of a new format for use with OpenGL-based applications for the Web and mobile, and glTF, the Graphics Library Transmission Format, was born.
glTF uses the full-featured nature of COLLADA as a jumping-off point, but it is a completely new format. The COLLADA feature set acts as a reference for the group to determine what sort of graphics features to support, but the details are completely different. glTF uses JSON files to describe scene graph structure and high-level information (such as cameras and lights), and binary files to describe rich data such as vertices, normals, colors, and animation. The binary format for glTF has been designed so that it can be loaded directly into WebGL buffers (typed arrays such as Int32Array and Float32Array). So, the process of loading a glTF file can be as simple as the following:
1. Read a small JSON wrapper file.
2. Load an external binary file via Ajax.
3. Create a handful of typed arrays.
4. Call WebGL drawing context methods to render.
Of course, in practice it is a bit more complicated. But this is far more efficient than downloading and parsing an XML file, and converting arrays of JavaScript Number types to typed arrays. glTF promises significant wins in both file size and speed of loading content—both critical factors in building high-performance web and mobile applications.
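The four steps above can be sketched as follows. This is a simplification, not a real glTF loader: fetchFn is injectable so the sketch does not depend on a network, and error handling and the actual draw calls are left out.

```javascript
// Sketch of the four-step glTF load sequence. The JSON wrapper names
// its binary buffers; each bufferView is then exposed as a typed-array
// view over the downloaded ArrayBuffer, ready for gl.bufferData().
async function loadGltf(jsonUrl, fetchFn = fetch) {
  // 1. Read the small JSON wrapper file
  const gltf = await (await fetchFn(jsonUrl)).json();

  // 2. Load the external binary file(s) via Ajax
  const buffers = {};
  for (const [name, buf] of Object.entries(gltf.buffers)) {
    buffers[name] = await (await fetchFn(buf.path)).arrayBuffer();
  }

  // 3. Create a handful of typed arrays, one per bufferView
  const views = {};
  for (const [id, bv] of Object.entries(gltf.bufferViews)) {
    views[id] = new Uint8Array(buffers[bv.buffer],
                               bv.byteOffset, bv.byteLength);
  }

  // 4. The caller uploads each view with gl.bufferData() and draws
  return { gltf, views };
}
```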
Example 8-6 shows the syntax of the JSON for a typical glTF scene, the famous COLLADA duck model. Note that there are structural similarities to COLLADA: libraries appear first, and we define a scene graph structure at the end by referencing elements in those libraries. But this is where the similarity ends. glTF dispenses with any information not absolutely required for runtime use, opting instead to define structures that will load quickly into WebGL and OpenGL ES. glTF defines in painstaking detail the attributes (vertex positions, normals, colors, texture coordinates, and so on) that are used to render objects with programmable shaders. Using this attribute information, a glTF application can faithfully render any meshes, even if it does not have its own sophisticated materials system.
In addition to the JSON file, glTF references one or more binary files (.bin extension) that store rich data (e.g., vertex data for meshes and animations) in structures called buffers and buffer views. Using this approach, we can stream, download incrementally, or load glTF content in one whack, as appropriate for the application.
Example 8-6. glTF JSON file format example
{
"animations": {},
"asset": {
"generator": "collada2gltf 0.1.0"
},
"attributes": {
"attribute_22": {
"bufferView": "bufferView_28",
"byteOffset": 0,
"byteStride": 12,
"count": 2399,
"max": [
96.1799,
163.97,
53.9252
],
"min": [
-69.2985,
9.92937,
-61.3282
],
"type": "FLOAT_VEC3"
},
... more vertex attributes here
"bufferViews": {
"bufferView_28": {
"buffer": "duck.bin",
"byteLength": 76768,
"byteOffset": 0,
"target": "ARRAY_BUFFER"
},
"bufferView_29": {
"buffer": "duck.bin",
"byteLength": 25272,
"byteOffset": 76768,
"target": "ELEMENT_ARRAY_BUFFER"
}
},
"buffers": {
"duck.bin": {
"byteLength": 102040,
"path": "duck.bin"
}
},
"cameras": {
"camera_0": {
"aspect_ratio": 1.5,
"projection": "perspective",
"yfov": 37.8492,
"zfar": 10000,
"znear": 1
}
},
... other high-level objects here, e.g., materials and lights
... finally, the scene graph
"nodes": {
"LOD3sp": {
"children": [],
"matrix": [
... matrix data here
],
"meshes": [
"LOD3spShape-lib"
],
"name": "LOD3sp"
},
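As a concrete illustration of how little work the format leaves for load time, attribute_22 above (a FLOAT_VEC3 of 2,399 entries) maps onto its bufferView with nothing more than a typed-array constructor. This is a sketch under the assumption of tightly packed float data; a full loader also honors byteStride for interleaved vertices.

```javascript
// Builds the Float32Array for a glTF vertex attribute directly over
// the downloaded binary buffer -- no per-number text parsing at all.
// Assumes tightly packed float data (stride equals component size).
function attributeToTypedArray(attr, bufferView, arrayBuffer) {
  const components =
    { FLOAT: 1, FLOAT_VEC2: 2, FLOAT_VEC3: 3, FLOAT_VEC4: 4 }[attr.type];
  return new Float32Array(arrayBuffer,
                          bufferView.byteOffset + attr.byteOffset,
                          attr.count * components);
}
```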
While the design focus of glTF is on compact and efficient representation of OpenGL data, the team has taken a balanced design approach that preserves other essential 3D data authored in DCC tools, such as animation, cameras, and lighting. The current version of glTF (version 1.0) supports the following features:
Meshes
Polygonal meshes made up of one or more geometry primitives. The mesh definition is in the JSON file, which references one or more binary data files that contain the vertex data.
Materials and shaders
Materials can be expressed as high-level common constructs (Blinn, Phong, Lambert), or implemented in GLSL vertex and fragment shaders that are included as external files referenced by the glTF JSON file.
Lights
Common light types (directional, point, spot, and ambient) are represented as high-level constructs in the JSON file.
Cameras
glTF defines common camera types such as perspective and orthographic.
Scene graph structure
The scene is represented as a hierarchical graph of nodes (i.e., meshes, cameras, and lights).
Transform hierarchy
Each node in the scene graph has an associated transformation matrix. Each node can contain children; child nodes inherit their parents’ transformation information.
Animations
glTF defines data structures for key frame, skinned, and morph-based animations.
External media
Images and video used as texture maps are referenced via URL.
The glTF project, although executed under the auspices of the Khronos Group, is a completely open effort to which anyone can contribute. There is a source code repository on GitHub that includes working viewers and sample content, and the specification itself. Following a philosophy that we will standardize no features without first proving them in code, the team has already developed four independent glTF viewers, including one for use with Three.js (which we will look at shortly). For more information, see the main Khronos glTF page.
Autodesk FBX
There is one more full-featured scene format worth mentioning, at least in passing. The FBX format from Autodesk is a file format originally developed by Kaydara for use with MotionBuilder. After Autodesk acquired Kaydara, it began to use the FBX format in several of its products. At this point, FBX has become a standard for interchanging data between the various Autodesk products (3ds Max, Maya, and MotionBuilder).
FBX is a rich format that supports many 3D and motion data types. Unlike the other formats covered in this chapter, FBX is proprietary, completely controlled by Autodesk. Autodesk has documented the format, and provided SDKs to read and write FBX in C++ and Python; however, the SDKs require product licenses, which can represent a prohibitive cost for some. There have been successful FBX imports and exports written without the SDKs, such as for Blender, but it is not clear whether these can be used legitimately, given the terms of the FBX license.
Given the proprietary nature of the format, and the ambiguities around licensing, it may be wise to steer clear of FBX. On the other hand, it is a very powerful technology in use in the industry's best tools, and therefore worth a look.
译文
节选自:《HTML5与WebGL编程》
3D内容制作流程
在Web发展的早期,如果你知道如何写标签,你就是Web内容的创建者。当时没有Dreamweaver这类所见即所得的编辑器,也没有Photoshop这类切图工具。在当时,Web内容创建的绝大部分工作都由程序员来完成——而那个时期的Web也恰恰反映出了这个特色。后来,软件开发商们发布了创建Web内容的专业编辑工具。当这些工具流行起来之后,艺术家和设计师们也逐渐承担起创建内容的责任,互联网的体验终于提升到消费级的水平。
WebGL开发正在经历和早年Web类似的演变。该技术最早出现的几年中,内容是由程序员利用文本编辑器手工编写创建的,或者使用能够找到的转换器将其他3D格式的内容转换为WebGL所支持的格式。如果找不到现成的转换工具,那么为了完成项目,你可能需要自行编写一个。
幸运的是,这种情况正在快速转变。Three.js和其他WebGL库在导入专业工具创建的3D内容这方面表现得越来越强大。业界还齐心协力创建了专供Web使用的3D文件格式标准。内容创建的局面从整体来说依旧比较原始,但对比几年前,WebGL内容的开发可以说已经从粗放的“石器时代”完成了向至少有可用工具的3D开发“青铜时代”的跨越。
本章涵盖了Web 3D内容的开发流程。首先,我们会通览内容创建的过程。如果你对3D创作不熟悉,那这部分内容对你来说会非常有用。其次,我们会大体了解现今WebGL项目中使用的流行建模和动画工具,并深入研究最适合用于Web开发的3D文件格式细节。最后,我们会学习使用Three.js的工具将这些文件加载进来,为后续章节的项目做准备。
3D内容创建过程
3D内容的创建涉及一系列高度专业化的学科。3D工作者的整个职业生涯需要广泛的培训以及对复杂创作工具和流程的深入理解。一名3D艺术家通常需要负责所有的事情,包括建模、纹理映射以及动画。但有时候,尤其是在比较大的项目中,人们会进行分工。
3D内容的创建在很多方面与使用Photoshop和Illustrator来创建2D艺术作品相类似,但两者之间也有一些本质上的不同。即便你是一名技术人员,当你打算开发一个3D项目的时候,最好还是对被导入到项目中的内容的产出过程有一定的了解。怀着这个目的,让我们来看看3D内容创建过程中的基本步骤。
建模
在3D模型的创作中,通常先由艺术家绘制一个草图,然后使用建模软件包将草图转换为3D数字表现。模型通常用3D多边形网格来表示:首先通过线框图来绘制,然后使用材质来着色。这个过程被称为3D建模(3D modeling),专门从事这项工作的人被称为建模师(modeler)。图8-1描绘了一个由Autodesk 3ds Max创建的简单茶壶模型。我们可以从四个不同的视图观察该模型:顶部、左边、前面和透视图。
图8-1:3ds Max中的3D建模,有顶部、左边、前面和透视图四个视图
图片版权为Autodesk所有,来自维基百科词条
纹理映射
纹理映射(texture mapping,也称为纹理贴图),又被称为UV映射(UV mapping),是将创建好的素材展开附着到3D物体表面上的过程。建模师通常会自己进行纹理映射,但在比较大的项目中,贴图这项职责可能会被划分出来,由专门的贴图师(texture artist)来负责。纹理映射一般由内建在建模软件包中的可视化工具来辅助完成。这类工具让贴图师可以通过可视化的方式将2D纹理图和网格中的顶点关联起来并预览效果。图8-2描绘了纹理映射:图的左边是纹理;结合视图在右下方,可以看到图片数据覆盖在顶点的位置上;预览效果在右上方。注意左边的纹理图布局有点违反直觉:只显示了半边脸。这是因为对于这个纹理映射来说,左右脸是一样的。这个策略允许贴图师在较小的空间里放更多的数据,并且使用图片的其他部分来增加更多细节。
图8-2:纹理映射一个2D图像包裹并映射到一个3D物体的表面
图片由Simon Wottge提供
动画
创建3D动画的过程可以很简单也可以极为复杂,这取决于任务本身。关键帧动画往往比较简单,至少从理论上来说是这样的,但界面可能会难以使用且视觉上杂乱。一个关键帧编辑器,就像图8-3中Autodesk Maya展示的那样,包含一个时间轴控制器(在Maya窗口底部用矩形高亮的部分),它允许动画师在视图中移动或改变物体,然后在时间轴上点击来创建关键帧。关键帧可以用来改变平移、旋转、缩放,甚至是光线及材质属性。当动画师需要在关键帧中定义多个属性的变化时,他需要在动画时间轴上增加额外的轨道。动画师在界面中通过堆叠的方式排列这些轨道,这会导致视觉上的混乱。
对有蒙皮的角色进行动画会牵涉更多的工作。在可以给角色设置动画前,需要先创建一系列的骨骼,也被称为骨架(rig),骨架决定了蒙皮跟随骨骼移动时的各种特性。创建骨架的过程是需要专业技巧的,角色的动画及骨架经常由不同的动画师完成。
图8-3:Maya的动画时间轴工具,具有对平移、旋转、缩放及其他属性变化的关键帧控制;图片来自UCBUGG Open Course Ware
技术美工
我们可能没想到编程也是内容创建活动的一部分,但在3D开发领域,它通常确实是。复杂的特殊效果,比如某些着色器和后期处理技术,需要有经验的程序员来完成。在游戏和动画公司里,这些工作被交给技术美工(technical artist)或技术总监(technical director)来做。对于这些职位并没有正式的定义,两个职位之间也没有严格的区别,但根据名字来判断,技术总监通常是更资深和有经验的人。技术总监需要写脚本、给角色创建骨架、编写在不同格式之间转换的程序、实现特殊效果、开发着色器,换句话说,就是所有那些对视觉设计师来说太过技术化的工作。这是非常有价值的技能,对于许多产品来说,一个好的技术总监价值连城。
因为技术总监主要靠写代码为生,他们所使用的工具通常是文本编辑器。但是,现在也有一些有趣的可视化开发工具可供开发着色器和特殊效果使用。其中之一是ShaderFusion,一个最近为Unity游戏引擎发布的可视化工具。ShaderFusion允许开发者通过定义从一个对象的输出(比如时间或位置)到另一个对象的输入(例如颜色或折射)之间的数据流来开发着色器。它的界面如图8-4所示。
图8-4:ShaderFusion,一个Unity3D引擎中的可视化着色器编辑器
3D文件格式
这些年出现了许多3D文件格式,多到无法在这里全部介绍完。有些3D格式是某个3D软件自己用的,有些格式是用在多个软件间交换数据的。有些格式是私有的,也就是说被某个公司或软件厂商完全控制,有些则是由一个工业小组开发的开放标准。有些格式是基于文本的,所以可读,有些则为了节省空间使用二进制格式。
3D文件格式可以分为三类:模型格式,用于表示单个模型;动画格式,用于动画的关键帧和角色定义;全功能格式,包含整个场景,以及其中的多个模型、变换层级结构、相机、光源及动画。我们会研究每种类型的格式,并重点研究适合Web应用的格式。
模型格式
单个模型的3D格式经常用来在不同的3D软件中交换数据。比如,大部分3D软件都能导入和导出OBJ格式(接下来会介绍)。因为它语法简单、功能集小,所以支持起来很容易,因此非常流行;但这也使得它的能力比较有限。
1. Wavefront OBJ
OBJ文件格式由Wavefront公司开发,它是业界最早也是支持最广泛的单一模型格式。它非常简单,只支持几何体(有顶点、法线及贴图坐标)。 Wavefront还引入了一个辅助的MTL(Material Template Library,材质模板库)格式来支持给模型附上材质。
例8-1展示了一个基本的OBJ文件,它是从经典的“太空椅”模型中摘录出来的。这个模型我们在后面的章节会使用Three.js加载进来(效果见图8-12)。OBJ文件及相关代码在models/ball_chair/ball_chair.obj中,让我们看看它的语法。其中#字符用作注释定界符。这个文件包含一系列的声明。第一个声明引用了它所关联的材质库MTL文件。然后定义了几个几何体。这段摘录显示了用于定义名为shell的对象的一部分数据,它是太空椅的外壳。我们以每行一条信息的方式,使用顶点位置、法线及贴图坐标数据来定义这个外壳,接下来是面的数据,同样是每个面一行。每个面的顶点通过形如v/vt/vn的三个索引来定义,其中v是顶点位置数据,vt是纹理坐标数据,而vn是顶点法线数据。
例8-1:一个 Wavefront OBJ格式的模型
# 3ds Max Wavefront OBJ Exporter v0.97b - (c)2007 guruware
# File Created: 20.08.2013 13:29:52
mtllib ball_chair.mtl
#
# object shell
#
v -15.693047 49.273174 -15.297686
v -8.895294 50.974277 -18.244076
v -0.243294 51.662109 -19.435429
... 其他顶点位置信息
vn -0.537169 0.350554 -0.767177
vn -0.462792 0.358374 -0.810797
vn -0.480322 0.274014 -0.833191
... 其他顶点法向量信息
vt 0.368635 0.102796 0.000000
vt 0.348531 0.101201 0.000000
vt 0.349342 0.122852 0.000000
... 其他顶点纹理坐标信息
g shell
usemtl shell
s 1
f 313/1/1 600/2/2 58/3/3 597/4/4
f 598/5/5 313/1/1 597/4/4 109/6/6
f 313/1/1 598/5/5 1/7/7 599/8/8
f 600/2/2 313/1/1 599/8/8 106/9/9
f 314/10/10 603/11/11 58/3/3 600/2/2
... 其他面片定义
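上面的面片数据可以用几行JavaScript解析出来。下面是一个极简的示意(parseFaceLine是为演示而虚构的辅助函数,并非书中或Three.js的实现),演示如何把一行f语句拆成v/vt/vn索引:

```javascript
// 解析OBJ文件中的一行f(面)语句,把每个v/vt/vn三元组拆成索引对象。
// 注意:OBJ的索引从1开始,这里保留原始值,由调用方决定何时减1。
function parseFaceLine(line) {
  return line.trim().split(/\s+/).slice(1).map(function (triplet) {
    const parts = triplet.split("/").map(Number);
    return { v: parts[0], vt: parts[1], vn: parts[2] };
  });
}

// 例8-1中shell物体的第一行面片数据(一个四边形,四个顶点)。
const face = parseFaceLine("f 313/1/1 600/2/2 58/3/3 597/4/4");
```

真正的加载器还需要把这些索引重新展开成WebGL能直接使用的顶点数组,这正是Three.js的OBJ加载器替我们做的事情。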
太空椅辅助的材质定义在models/ball_chair/ball_chair.mtl这个MTL文件中。它的语法非常简单,见例8-2。一个材质通过newmtl语句来定义,它包含了使用Phong着色的一些参数,包括高光反射颜色和系数(Ks、Ns和Ni关键字)、漫反射颜色(Kd)、环境颜色(Ka)、发光颜色(Ke)和纹理贴图(map_Ka及map_Kd)。MTL的纹理贴图经过几年的发展,已经包括了凹凸贴图、置换贴图、环境贴图及其他类型的纹理。在这个例子中,shell材质只定义了环境贴图和漫反射贴图。
例8-2:OBJ格式的材质定义
newmtl shell
Ns 77.000000
Ni 1.500000
Tf 1.000000 1.000000 1.000000
illum 2
Ka 0.000000 0.000000 0.000000
Kd 0.588000 0.588000 0.588000
Ks 0.720000 0.720000 0.720000
Ke 0.000000 0.000000 0.000000
map_Ka maps\shell_color.jpg
map_Kd maps\shell_color.jpg
...
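作为参考,下面给出一个把这些MTL参数映射为Three.js的MeshPhongMaterial构造参数的示意(mtlToPhongParams是假设性的辅助函数;为了脱离Three.js环境演示,这里只生成普通的参数对象,实际使用时可将其传给new THREE.MeshPhongMaterial()):

```javascript
// 把MTL中Phong着色的几个关键字映射为MeshPhongMaterial风格的参数:
// Kd → 漫反射颜色,Ks → 高光颜色,Ns → 高光系数,Ke → 自发光颜色。
function mtlToPhongParams(mtl) {
  return {
    color: mtl.Kd,
    specular: mtl.Ks,
    shininess: mtl.Ns,
    emissive: mtl.Ke,
  };
}

// 例8-2中shell材质的参数。
const shellParams = mtlToPhongParams({
  Ns: 77.0,
  Kd: [0.588, 0.588, 0.588],
  Ks: [0.72, 0.72, 0.72],
  Ke: [0.0, 0.0, 0.0],
});
```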
2. STL
另一个基于文本的简单单一模型格式是STL(StereoLithography),它由3D Systems公司开发,用于快速原型制作、制造业及3D打印。STL格式比OBJ更简单,只支持顶点几何,不支持法线、纹理坐标及材质。例8-3展示了一小段STL文件代码,它来自Three.js中的一个例子(examples/models/stl/pr2_head_pan.stl)。可以打开Three.js中的examples/webgl_loader_stl.html来查看运行时的样子。STL是基于WebGL开发在线3D打印应用的不错的格式,因为它能直接发送到3D打印设备上。另外,它容易加载,且渲染速度很快。
例8-3:STL文件格式
solid MYSOLID created by IVCON, original data in binary/pr2_head_pan.stl
facet normal -0.761249 0.041314 -0.647143
outer loop
vertex -0.075633 -0.095256 -0.057711
vertex -0.078756 -0.079398 -0.053025
vertex -0.074338 -0.088143 -0.058780
endloop
endfacet
...
endsolid MYSOLID
STL是如此简单和流行,GitHub已经增加了直接查看它的支持(github.com/blog/1465-s…
STL格式的技术细节可以参考维基百科(en.wikipedia.org/wiki/STL_%2…
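STL的语法简单到可以用十几行代码解析。下面是一个假设性的极简ASCII STL解析示意(并非Three.js中STLLoader的实现),从例8-3那样的文本中提取法线和顶点:

```javascript
// 逐行扫描ASCII STL文本:facet normal行给出法线,vertex行给出顶点。
function parseAsciiStl(text) {
  const normals = [];
  const vertices = [];
  for (const line of text.split("\n")) {
    const tokens = line.trim().split(/\s+/);
    if (tokens[0] === "facet" && tokens[1] === "normal") {
      normals.push(tokens.slice(2, 5).map(Number));
    } else if (tokens[0] === "vertex") {
      vertices.push(tokens.slice(1, 4).map(Number));
    }
  }
  return { normals, vertices };
}

// 例8-3中的第一个三角面片。
const solid = [
  "solid MYSOLID",
  "facet normal -0.761249 0.041314 -0.647143",
  "  outer loop",
  "    vertex -0.075633 -0.095256 -0.057711",
  "    vertex -0.078756 -0.079398 -0.053025",
  "    vertex -0.074338 -0.088143 -0.058780",
  "  endloop",
  "endfacet",
  "endsolid MYSOLID",
].join("\n");
const mesh = parseAsciiStl(solid);
```

每三个顶点构成一个三角形,因此解析结果可以不经索引直接交给WebGL渲染,这也是STL渲染速度快的原因之一。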
动画格式
前面一节描述的格式只能用来表示静态模型数据。但大部分3D应用中的内容在屏幕上是会动的(即带有动画)。有几种专门用来表示带动画模型的格式,包括基于文本(因此对Web友好)的MD2、MD5及BVH格式。
1. id Software动画格式:MD2和MD5
你会时不时在Web上看到id Software的流行游戏DOOM和Quake所使用的专有格式。MD2及其后续的MD5格式是用来定义角色动画的。这些格式由id Software控制,但他们很早就公布了格式规范,并且有许多工具支持导入它们。
MD2格式是为Quake II创建的,是一种二进制格式,它只支持通过变形目标(morph target)实现的基于顶点的角色动画。MD5(不要和在Web中广泛使用的加密散列函数“消息摘要”算法弄混了)是为Quake III开发的,它引入了蒙皮动画,并且是基于文本的可读格式。
MD2(tfc.duke.free.fr/coding/md2-…
要在WebGL应用中使用这些格式,我们可以自己编写一个加载器来直接读取它们;或者,如果使用Three.js这样的库,也可以使用转换工具。如果将MD2文件转成JSON格式,它看起来就像第5章中的例子那样,如图5-11所示。可以打开文件examples/webgl_morphtargets_md2_control.htm回顾一下;看看它的源码,你会发现加载和解释MD2文件需要做许多事情。
Three.js并没有在它的例子中包含MD5的加载器,但有一个不错的在线转换工具可以将MD5转成Three.js的JSON格式,它由Klas(OutsideOfSociety)开发,Klas就职于瑞典的网络公司North Kingdom(Find Your Way to OZ的开发团队)。要查看一个转换后的模型,可以去Klas的博客,打开这个链接(oos.moxiecode.com/js_webgl/md5_converter/),它允许你拖拽一个MD5文件到窗口中,然后输出JSON代码。
2.BVH:动作捕捉数据格式
动作捕捉(motion capture)是一种记录物体运动的方式,它在制作人体动画方面非常流行,在电影、动画、军事及运动应用中广泛使用。动作捕捉得到了开源格式的广泛支持,其中包括Biovision Hierarchical Data(BVH)。BVH由Biovision公司开发,用于表示人体的运动。BVH非常流行,它基于文本,有很多工具支持导入和导出它。
开发者Aki Miyazaki做了一个将BVH数据导入WebGL应用的早期尝试。他的BVH Motion Creator是一个基于Web的BVH预览工具,使用Three.js开发,如图8-11所示。借助这一工具,使用者可以上传BVH文件,预览它在简单角色上的动画效果。
图 8-11:BVH Motion Creator(www.akjava.com/demo/bvhpla…)
一个BVH格式的预览工具
全功能的场景格式
多年来,业界开发了几种标准格式来表示整个3D场景,包括多个模型、变换层级结构、相机、光源和动画,以及所有能在3ds Max、Maya、Blender这类全功能软件中创造出来的东西。总的来说,这依旧是一个有待解决的棘手技术问题,有几种格式存活了下来并得到了广泛使用。但局势可能会有所改变,因为WebGL带来了新的需求,尤其是在应用间重用数据的需求。本节中,我们会研究几种有可能在WebGL中使用的格式。
1. VRML和X3D
虚拟现实标记语言(Virtual Reality Markup Language,VRML)是最初用于3D Web的文本格式,它于1994年由发明家和理论家Mark Pesce、硅谷图形公司(Silicon Graphics)Open Inventor软件团队的成员,以及我所在的小组创建。VRML在20世纪90年代经历了几次迭代,得到了业界以及一个非营利标准协会的支持。它的后续格式基于XML,在21世纪初开发出来,名称改为了X3D。尽管这些格式在Web应用中已经不再广泛使用,大多数建模工具依然支持导入和导出它们。
VRML和X3D定义了完整的场景、动画(关键帧、变形及蒙皮)、材质、光源,甚至还有通过脚本实现行为的交互式物体。例8-4展示了X3D语法,它创建了一个带有红色立方体的场景,当鼠标点击的时候,立方体会绕着y轴做2秒钟的360°旋转。X3D的几何体、行为及动画都在一个XML文件中,直观且可读性好。直到今天,还没有一个开放的3D文件格式可以用如此简单和优美的语法实现同样的功能(请原谅我自卖自夸)。
例8-4:X3D的例子,当点击的时侯红色立方体会旋转
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE X3D PUBLIC "ISO//Web3D//DTD X3D 3.0//EN"
"http://www.web3d.org/specifications/x3d-3.0.dtd">
<X3D profile='Interactive' version='3.0'
xmlns:xsd='http://www.w3.org/2001/XMLSchema-instance'
xsd:noNamespaceSchemaLocation =
' http://www.web3d.org/specifications/x3d-3.0.xsd '>
<head>
… <!-- 此处为X3D文件的XML meta信息 -->
</head>
<!--
定义(DEF)节点索引:动画(Animation),点击器(Clicker),时间线(TimeSource),变换矩阵(XForm)
-->
<Scene>
<!-- XForm ROUTE: [从Animation.value_changed到rotation ] -->
<Transform DEF='XForm'>
<Shape>
<Box/>
<Appearance>
<Material diffuseColor='1.0 0.0 0.0'/>
</Appearance>
</Shape>
<!-- Clicker ROUTE: [从touchTime到TimeSource.startTime ] -->
<TouchSensor DEF='Clicker' description='click to animate'/>
<!-- TimeSource ROUTEs:
[从Clicker.touchTime到startTime ][从fraction_changed到Animation.set_fraction ]-->
<!-- Animation ROUTEs:
[从TimeSource.fraction_changed到set_fraction ]
[从value_changed到XForm.rotation] -->
<OrientationInterpolator DEF='Animation' key='0.0 0.33 0.66 1.0'
keyValue='0.0 1.0 0.0 0.0 0.0 1.0 0.0 2.1 0.0 1.0 0.0 4.2 0.0 1.0 0.0 0.0'/>
</Transform>
<ROUTE fromNode='Clicker' fromField='touchTime' toNode='TimeSource' toField='startTime'/>
<ROUTE fromNode='TimeSource' fromField='fraction_changed' toNode='Animation' toField='set_fraction'/>
<ROUTE fromNode='Animation' fromField='value_changed' toNode='XForm' toField='rotation'/>
</Scene>
</X3D>
VRML的设计体现了交互式3D图形的许多关键概念,因此,你或许会认为它适合WebGL使用。但是,这个格式是在JavaScript和DOM出现之前设计的,并且早于许多现今正在使用的硬件加速关键特性。
基于这一点,依我个人浅见,VRML/X3D过时了,不再适合实际使用。不过,它里面有许多可以借鉴到 WebGL中的想法。
多年来,出现了许多VRML和X3D格式的内容。德国的Fraunhofer研究所还在继续X3D的研究,并创建了X3DOM,一个使用WebGL显示X3D内容的查看器,更多关于X3DOM的信息,可以访问www.x3dom.org/。
VRML(www.web3d.org/standardsvr…
2. COLLADA:数字资源交换格式
2005年左右,VRML开始显得有些陈旧。包括Sony Computer Entertainment、Alias Systems和Avid Technology在内的一些公司组成了一个小组,一起开发一种3D数字资源格式,用于在游戏和交互式3D应用中交换数据。索尼的Remi Arnaud和Mark C. Barnes领导了这个格式的开发,并将其命名为COLLADA(COLLAborative Design Activity)。在第一版规范完成并得到相关公司支持后,这个标准的开发移交给了Khronos组织,该非营利组织还负责开发WebGL、OpenGL以及其他图形硬件及软件API的规范。
COLLADA和X3D一样,是一种基于XML的全功能格式,可以表示整个场景以及多种不同类型的几何体、材质、动画及灯光。和X3D不同的是,COLLADA的目标不是实现一个包含行为及运行时语义、面向最终用户的格式。事实上,它并没有明确的运行时目标,而是试图完整保存从3D软件中导出的所有信息,使得这些信息可以被后续的软件使用,或者被游戏引擎及开发环境导入,然后部署到最终的应用中。它的主要想法是:一旦COLLADA被业界广泛接受,各种DCC工具厂商就不需要再编写导出为其他自定义格式的插件,只要支持导出为COLLADA,理论上任何其他软件都能导入。
例8-5展示了一个COLLADA场景的部分代码,我们将在本章后面使用Three.js加载它。这里简单讲解一下COLLADA文件结构中值得一提的几个特点。首先,它以库的形式来组织结构,这些库有不同的类型,比如图片、着色器和材质;这些库首先在XML中定义,然后被其他需要的部分引用(比如材质定义中使用的图片)。其次,注意它会显式声明一些通常被认为是内置的构造,比如Blinn着色模型。COLLADA不对着色器和渲染模型作任何假设,它只是存储这些信息,让另一个工具拿到这些信息后再去作相应的处理。接着,我们可以看到网格顶点数据表示为一系列类型为float_array的元素。最后,通过引用前面定义的几何体及材质(使用XML中的instance_geometry、bind_material及instance_material元素),这些网格被组装成一个用户可见的场景。
例8-5:COLLADA文件的结构,包括资源(asset)、几何体及场景
<?xml version="1.0"?>
<COLLADA xmlns="http://www.collada.org/2005/11/COLLADASchema"
version="1.4.1">
<asset>
<contributor>
<authoring_tool>CINEMA4D 12.043 COLLADA Exporter
</authoring_tool>
</contributor>
<created>2012-04-25T16:44:59Z</created>
<modified>2012-04-25T16:44:59Z</modified>
<unit meter="0.01" name="centimeter"/>
<up_axis>Y_UP</up_axis>
</asset>
<library_images>
<image id="ID5">
<init_from>tex/Buss.jpg</init_from>
</image>
...<!-- 其他图片的定义 -->
</library_images>
<library_effects>
<effect id="ID2">
<profile_COMMON>
<technique sid="COMMON">
<blinn>
<diffuse>
<color>0.8 0.8 0.8 1</color>
</diffuse>
<specular>
<color>0.2 0.2 0.2 1</color>
</specular>
<shininess>
<float>0.5</float>
</shininess>
</blinn>
</technique>
</profile_COMMON>
</effect>
...<!-- 其他效果的定义 -->
<library_geometries>
<geometry id="ID56">
<mesh>
<source id="ID57">
<float_array id="ID58" count="22812">36.2471
9.43441 -6.14603 36.2471 11.6191 -6.14603 36.2471 9.43441 -9.04828
36.2471 11.6191 -9.04828 33.356 9.43441 -9.04828 33.356 11.6191
-9.04828 33.356 9.43441
...<!-- 网格定义的其余部分 -->
...
<!-- 以节点层级结构的形式定义场景 -->
<library_visual_scenes>
<visual_scene id="ID53">
<node id="ID55" name="Buss">
<translate sid="translate">5.08833 -0.496439
-0.240191</translate>
<rotate sid="rotateY">0 1 0 0</rotate>
<rotate sid="rotateX">1 0 0 0</rotate>
<rotate sid="rotateZ">0 0 1 0</rotate>
<scale sid="scale">1 1 1</scale>
<instance_geometry url="#ID56">
<bind_material>
<technique_common>
<instance_material
symbol="Material1" target="#ID3">
<bind_vertex_input
semantic="UVSET0"
input_semantic="TEXCOORD"
input_set="0"/>
</instance_material>
</technique_common>
</bind_material>
</instance_geometry>
</node>
... <!-- 其余场景定义 -->
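下面用一个假设性的小例子演示如何从COLLADA片段中提取float_array里的网格数据。真实的加载器会使用完整的XML解析器(比如浏览器中的DOMParser),这里为了演示只用正则表达式抽取文本内容:

```javascript
// 从COLLADA XML文本中找出所有<float_array>元素,
// 按id收集为Float32Array,便于后续组装成WebGL顶点缓冲。
function extractFloatArrays(xml) {
  const result = {};
  const re = /<float_array\s+id="([^"]+)"[^>]*>([\s\S]*?)<\/float_array>/g;
  let match;
  while ((match = re.exec(xml)) !== null) {
    const numbers = match[2].trim().split(/\s+/).map(Number);
    result[match[1]] = new Float32Array(numbers);
  }
  return result;
}

// 仿照例8-5的一个简化片段(count缩小为9,便于演示)。
const fragment =
  '<float_array id="ID58" count="9">36.2471\n' +
  "9.43441 -6.14603 36.2471 11.6191 -6.14603 36.2471 9.43441 -9.04828" +
  "</float_array>";
const arrays = extractFloatArrays(fragment);
```

这种“解析文本再转成数字”的开销,正是下一节glTF用二进制缓冲来避免的。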
经过了诞生初期的高度热情及广泛的厂商支持后,对COLLADA的支持开始衰退。从大概2010年开始,流行DCC工具中针对这个格式的导出插件的开发几乎停止了。最近,COLLADA又被重新重视起来,主要原因是支持WebGL的需求:WebGL缺乏内置的文件格式(后面会更详细地介绍)。随后出现了一个新的开源项目OpenCOLLADA(www.khronos.org/collada/wik…),它为3ds Max及Maya(2010年以后的版本)提供了导出插件,能导出干净、兼容的COLLADA格式。
尽管增强对COLLADA格式的支持对于3D内容制作流程是一大好处,但它有个问题。正如我们在前面例子中看到的,COLLADA非常冗长。这个格式是设计用来保存更多数据的,而不是为了快速下载和解析。因此,Khronos组织开始开发一个新的格式,既保留了COLLADA中的优秀特性——能完全表示丰富的3D动画场景,又同时考虑了Web传输问题,这个格式是glTF。
3. glTF:一个用于WebGL、OpenGL ES及OpenGL应用的新格式
随着WebGL的逐渐流行,一个需要Web开发者解决的问题也随之而来:如何从3D DCC工具中把完整的信息导出到Web应用中。类似OBJ那样的单一网格文本格式可以表示一个物体,但不包含场景图结构、光源、相机及动画。COLLADA的功能完善,但正如我们在前面一节中看到的,它很冗长;而且它是XML格式,需要大量CPU计算才能解析并转成可以被WebGL渲染的数据结构。我们需要一种紧凑、适合Web的格式,它在渲染前只需要进行最少的额外处理,就像3D领域的JPEG。
在2012年的夏天,Fabrice Robinet(摩托罗拉工程师、Khronos COLLADA工作组主席)开始开发一种拥有COLLADA功能,但比COLLADA更紧凑、更适合WebGL的3D格式。最开始这个项目被称为COLLADA2JSON,它的想法是将笨重的XML语法转成轻量级的JSON。从那以后,它开始走上属于自己的发展道路。Khronos COLLADA工作组中的其他成员也加入进来,包括我、COLLADA的创造者Remi Arnaud,以及Patrick Cozzi(国防软件供应商AGI的工程师)。我们的工作从简单地转换和优化COLLADA格式,变成了从头设计一种新的格式,用于基于OpenGL的Web及移动应用。于是glTF(Graphics Library Transmission Format,图形库传输格式)诞生了。
glTF以COLLADA的全功能特性作为出发点,但它是一种全新的格式。COLLADA的特性集只是被用作设计参考,具体细节完全不同。glTF使用JSON来描述场景结构及高层信息(比如相机和光源),使用二进制格式来存储详细的数据,比如顶点、法线、颜色和动画。glTF的二进制数据被设计为能直接加载到WebGL的缓冲中(通过Int32Array和Float32Array这样的类型数组)。加载一个glTF文件的流程可以简单地描述如下:
(1)读取一个简单的JSON包装文件;
(2)通过Ajax加载外部二进制文件;
(3)创建一些类型数组;
(4)调用WebGL绘制内容的方法来渲染。
当然,实际情况要比这复杂一点,但这比下载并解析XML文件、再把JavaScript的Number类型转成类型数组要高效多了。glTF可以保证文件体积小且加载速度快,这些都是创建高性能Web及移动应用的关键因素。
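下面把上述流程中的第(3)、(4)步写成可运行的示意代码(纯属演示:第(1)、(2)步下载得到的二进制数据在此直接用ArrayBuffer代替,gl也用一个只记录调用的桩对象代替真实的WebGL上下文,实际应用中gl应来自canvas.getContext("webgl")):

```javascript
// (3) 在二进制数据上建立类型数组视图,(4) 交给WebGL缓冲并绘制。
function drawIndexedPrimitive(gl, binary, view) {
  const indices = new Uint16Array(binary, view.byteOffset, view.byteLength / 2);
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, buffer);
  gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, indices, gl.STATIC_DRAW);
  gl.drawElements(gl.TRIANGLES, indices.length, gl.UNSIGNED_SHORT, 0);
  return indices.length;
}

// 桩WebGL上下文:只记录调用顺序,便于在没有浏览器的环境里观察流程。
const calls = [];
const gl = {
  ELEMENT_ARRAY_BUFFER: 34963,
  STATIC_DRAW: 35044,
  TRIANGLES: 4,
  UNSIGNED_SHORT: 5123,
  createBuffer: () => ({}),
  bindBuffer: () => calls.push("bindBuffer"),
  bufferData: () => calls.push("bufferData"),
  drawElements: () => calls.push("drawElements"),
};

// 24字节的索引数据 = 12个16位索引(相当于4个三角形)。
const indexCount = drawIndexedPrimitive(gl, new ArrayBuffer(24), {
  byteOffset: 0,
  byteLength: 24,
});
```

可以看到,从二进制数据到绘制调用之间没有任何文本解析,这正是glTF高效的原因。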
例8-6展示了一个典型glTF场景的JSON语法,这个场景就是著名的COLLADA鸭子模型。注意,它的结构与COLLADA有相似之处:首先出现的是库,最后通过引用这些库中的元素来定义场景图结构。但相同之处也就只有这些了。glTF省掉了运行时非必需的信息,转而定义能被WebGL和OpenGL ES快速加载的结构。glTF以极细的粒度定义了用于在可编程着色器中渲染物体的属性信息(顶点位置、法线、颜色、纹理坐标等)。利用这些属性信息,glTF应用可以如实渲染任意网格,即便它自己没有复杂的材质系统。
除了JSON文件,glTF还会引用一个或多个存储了丰富数据(比如网格和动画的顶点数据)的二进制文件(后缀是.bin),这些二进制数据组织为称作缓冲(buffer)和缓冲视图(buffer view)的结构。使用这种方式,我们可以流式传输、增量下载,或者一次性加载glTF内容,这取决于应用需求。
例8-6:glTF JSON文件格式的例子
{
"animations": {},
"asset": {
"generator": "collada2gltf 0.1.0"
},
"attributes": {
"attribute_22": {
"bufferView": "bufferView_28",
"byteOffset": 0,
"byteStride": 12,
"count": 2399,
"max": [
96.1799,
163.97,
53.9252
],
"min": [
-69.2985,
9.92937,
-61.3282
],
"type": "FLOAT_VEC3"
},
... 此处省略其他顶点属性
"bufferViews": {
"bufferView_28": {
"buffer": "duck.bin",
"byteLength": 76768,
"byteOffset": 0,
"target": "ARRAY_BUFFER"
},
"bufferView_29": {
"buffer": "duck.bin",
"byteLength": 25272,
"byteOffset": 76768,
"target": "ELEMENT_ARRAY_BUFFER"
}
},
"buffers": {
"duck.bin": {
"byteLength": 102040,
"path": "duck.bin"
}
},
"cameras": {
"camera_0": {
"aspect_ratio": 1.5,
"projection": "perspective",
"yfov": 37.8492,
"zfar": 10000,
"znear": 1
}
},
... 其他高级物体,如材质和灯光
... 最后是场景图(scene graph)
"nodes": {
"LOD3sp": {
"children": [],
"matrix": [
... 矩阵数据
],
"meshes": [
"LOD3spShape-lib"
],
"name": "LOD3sp"
},
尽管glTF设计的重点是为OpenGL数据提供紧凑和高效的表示,但设计团队采取了一种平衡的设计方案,保留了DCC工具所创建的其他必备3D数据,比如动画、相机和光源。当前版本的glTF(1.0)支持以下特性。
网格
多边形网格由一个或多个几何基元组成。网格定义在JSON文件中,并引用一个或多个包含顶点数据的二进制数据文件。
材质和着色器
材质可以表示为高层的通用结构(Blinn、Phong、Lambert),或者用GLSL顶点着色器和片段着色器实现,这些着色器作为外部文件被glTF JSON文件引用。
光源
通用光源类型(定向光、点光源、聚光灯和环境光)在JSON文件中用高层结构表示。
相机
glTF定义了通用的相机类型,比如透视相机和正交相机。
场景图结构
场景表示为由节点(即网格、相机和光源)组成的层级图。
变换层级
场景图中的每个节点都有一个关联的变换矩阵。每个节点都可以包含子节点,子节点继承父节点的变换信息。
动画
glTF定义了用来表示关键帧、蒙皮及变形动画的数据结构。
外部媒体
用作纹理贴图的图片和视频通过URL引用。
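变换层级中“子节点继承父节点变换”这一点,可以用几行代码说明。下面是一个假设性的示意(multiply、translation、computeWorldMatrices均为演示用的虚构函数),采用与WebGL/glTF一致的4×4列主序矩阵:

```javascript
// 4×4列主序矩阵乘法:out = a × b。
function multiply(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++)
    for (let row = 0; row < 4; row++)
      for (let k = 0; k < 4; k++)
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
  return out;
}

// 平移矩阵(列主序:平移分量位于第12、13、14个元素)。
function translation(x, y, z) {
  return [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, x, y, z, 1];
}

// 深度优先遍历场景图:世界矩阵 = 父节点世界矩阵 × 自身矩阵。
function computeWorldMatrices(node, parentWorld, out) {
  const world = multiply(parentWorld, node.matrix);
  out[node.name] = world;
  for (const child of node.children) {
    computeWorldMatrices(child, world, out);
  }
  return out;
}

// 父节点向x平移10,子节点再向y平移5,子节点的世界位置应为(10, 5, 0)。
const scene = {
  name: "root",
  matrix: translation(10, 0, 0),
  children: [{ name: "child", matrix: translation(0, 5, 0), children: [] }],
};
const worlds = computeWorldMatrices(scene, translation(0, 0, 0), {});
```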
glTF项目尽管是在Khronos组织的支持下进行的,但它是完全公开的,任何人都可以贡献代码。GitHub上有一个源码仓库,其中包含可用的查看器、示例内容及规范本身。glTF团队信奉的理念是:功能需要先在代码中得到验证,然后才写进规范。该团队已经开发了四个独立的glTF查看器,其中一个可以在Three.js中使用(我们稍后会看到)。想要获取更多信息,请访问Khronos glTF的主页(gltf.gl/)。
4. Autodesk FBX
还有一个全功能场景格式值得一提,至少要顺带介绍一下。FBX是Autodesk旗下的一种文件格式,最早由Kaydara开发,用于MotionBuilder。Autodesk收购Kaydara后,开始在它的多个产品中使用FBX格式。就这样,FBX成了各种Autodesk产品(3ds Max、Maya和MotionBuilder)之间交换数据的标准。
FBX是一种很丰富的格式,支持许多3D和动画数据类型。和本章介绍的其他格式不同,FBX是私有的,被Autodesk完全控制。Autodesk公开了这个格式的文档,并提供C++和Python的SDK来读写FBX文件;但这些SDK需要产品许可证,对某些人来说成本可能高得难以接受。也有一些不借助SDK实现的FBX导入导出工具,比如Blender中的实现,但考虑到FBX的授权条款,不清楚这样使用是否合法。
因为这个格式是私有的,而且许可协议含糊,所以最好避开FBX。但另一方面,它是业界顶尖工具中使用的一种非常强大的技术,因此值得一看。想要获取更多信息,请访问FBX的主页(www.autodesk.com/products/fb…