This tutorial is part of a Collection: 04. DirectX 12 - Braynzar Soft Tutorials
04. Drawing!
We will start drawing geometry onto the screen in this tutorial. We will learn more about Pipeline State Objects (PSO) and Root Signatures. We will also learn about resource heaps, viewports, scissor rectangles and vertices!
####Resource Heaps####
Resource heaps are like descriptor heaps, but instead of storing descriptors, they store the actual data for the resources. They are a chunk of allocated memory, either on the GPU or on the CPU, depending on the type of heap. Unlike descriptor heaps, the maximum size of a resource heap depends on the type of the heap and on the available video (GPU) memory or system (CPU) memory. Data may include vertex buffers, index buffers, constant buffers or textures. There are three types of heaps:

##Upload Heaps##
Upload heaps are used to upload data to the GPU. The GPU has read access to this memory, while the CPU has write access. Your application will store a resource, such as a vertex buffer, in this type of heap, then use a command list to copy the data from this heap to a default heap using the .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn899212(v=vs.85).aspx][UpdateSubresources()] function.

##Default Heaps##
Default heaps are chunks of memory on the GPU. The CPU has no access to this memory, which makes accessing it from shaders very fast. This is where you will want to store the resources that your shaders use. To get a resource into this heap from your application, you first create an upload heap, store the resource in the upload heap, then use the UpdateSubresources() function to copy the data from the upload heap to the default heap. This function only records a copy command in a command list; the copy itself is executed once you execute that command list on the command queue.

If the application changes a resource often, such as every frame, it has to upload the new data every time. In that case it would be inefficient to also copy the data to a default heap every time, so you would use only an upload heap. For resources that do not change frequently, or that are only changed by the GPU, use a default heap.
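The fact that the upload-to-default copy is only recorded at first, and does not run until the command list is executed, is easy to miss. Here is a toy model of that behaviour in plain C++. None of these types are part of Direct3D 12: `Buffer`, `ToyCommandList`, `ToyCommandQueue` and `StageAndRecord` are hypothetical names made up for this sketch, and ordinary vectors stand in for the upload and default heaps.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model only: these types are hypothetical and are NOT Direct3D 12 types.
using Buffer = std::vector<float>;

struct ToyCommandList {
    std::vector<std::function<void()>> commands;

    // Records a copy from src to dst. No data moves yet.
    void CopyBuffer(Buffer& dst, const Buffer& src) {
        assert(&dst != &src);  // copying a buffer onto itself makes no sense
        Buffer* d = &dst;
        const Buffer* s = &src;
        commands.push_back([d, s] { *d = *s; });
    }
};

struct ToyCommandQueue {
    // Runs every recorded command in order, then empties the list.
    void Execute(ToyCommandList& list) {
        for (auto& command : list.commands) command();
        list.commands.clear();
    }
};

// Writes CPU data into the "upload heap", records the copy into the
// "default heap", and returns how many elements the default heap holds
// at that point (before anything has been executed).
inline std::size_t StageAndRecord(ToyCommandList& list, Buffer& uploadHeap,
                                  Buffer& defaultHeap, const Buffer& cpuData) {
    uploadHeap = cpuData;                      // the CPU writes to the upload heap
    list.CopyBuffer(defaultHeap, uploadHeap);  // the copy is only recorded here
    return defaultHeap.size();
}
```

In this model the default buffer stays empty after `StageAndRecord()` and only receives the data once `ToyCommandQueue::Execute()` runs. That is the same reason the upload heap in a real application must stay alive until the GPU has finished the copy.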
##Readback Heaps##
Readback heaps are chunks of memory that the GPU can write to and the CPU can read from. The data might be statistics from the GPU or information about a screen capture.

You can use a direct command queue to upload data to the GPU, which is what we do in these tutorials. There is also a copy queue, which you would use together with a copy command list and a copy command allocator to upload data to the GPU. This is more efficient (but makes the code a little more complex) because while the copy queue is executing commands that copy data from upload heaps to default heaps, the direct command queue can be executing draw commands.

####Vertices and Input Layouts####
To draw geometry, the pipeline needs information about the geometry. The geometry, in the form of a list of *vertices* (and indices, explained in the next tutorial), is passed to the Input Assembler (IA) stage of the pipeline. The IA needs to know how to read the vertex data, which is where *Input Layouts* and *Primitive Topology* come in.

##Vertices##
Vertices are what make up the geometry. A vertex always has a position (even if it is not set at first). It is a point in space, and vertices are used to define geometry like polygons, triangles, lines and points. A single vertex makes up a point, two vertices make up a line, three vertices make up a triangle, four make a quad and so on.

+[http://www.braynzarsoft.net/image/100217][Three vertices make up a triangle]

All solid objects are made up of triangles, the smallest possible surface. Direct3D only works with triangles for solid objects (points and lines can be drawn as well, but they have no surface, so they cannot make up a solid object). A quad, for example, is made up of two triangles.

A vertex position is defined by three values: x, y, and z. Generally in game programming, x is left and right, y is up and down, and z is depth.
You will create a structure representing a vertex, and you will pass an array of these structures, called a vertex buffer, to the GPU and bind that vertex buffer to the IA. An example structure might look like this:

struct ColorVertex {
    float x, y, z;    // position
    float r, g, b, a; // color
};

A vertex can also contain more information describing that part of a polygon, such as texture coordinates, a color, a normal, or weight information for animation. The rasterizer will interpolate these values across the polygon between the vertices. This means that the value at a given point on a polygon depends on how close that point is to each vertex.

+[http://www.braynzarsoft.net/image/100218][Interpolation]

##Primitive Topology##
Primitive Topology is used by the Input Assembler to know how to group the vertices and pass them through the pipeline. The primitive topology type defines whether the vertices make up a list of points, lines, triangles or patches.

When creating a Pipeline State Object (PSO), we need to say what **Primitive Topology Type** the IA will assemble the vertices as. There are four actual primitive topology types in the .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770385%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396][D3D12_PRIMITIVE_TOPOLOGY_TYPE] enumeration, plus an undefined type, which is the default value for a PSO (if the undefined type is used in a PSO, you will not be able to draw anything):

**D3D12_PRIMITIVE_TOPOLOGY_TYPE_UNDEFINED** - The default value; nothing can be drawn if this is set
**D3D12_PRIMITIVE_TOPOLOGY_TYPE_POINT** - Each vertex is drawn as one point
**D3D12_PRIMITIVE_TOPOLOGY_TYPE_LINE** - Vertices are grouped two at a time, and a line is drawn between them
**D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE** - Vertices are grouped three at a time, creating a triangle.
By default the triangle is filled in; if wireframe is set, lines are drawn between the three vertices instead
**D3D12_PRIMITIVE_TOPOLOGY_TYPE_PATCH** - For tessellation. If the Hull Shader and Domain Shader are set, then this must be the primitive topology type.

You must also set the primitive topology adjacency and ordering in the command list to a .[https://msdn.microsoft.com/en-us/library/windows/desktop/ff728726(v=vs.85).aspx][D3D_PRIMITIVE_TOPOLOGY] enumeration value. The primitive topology adjacency and ordering set in the command list must be compatible with the primitive topology type set in the PSO. This is how the input assembler will order the vertices/indices when assembling the geometry. Examples are point lists, line lists, triangle lists, triangle strips, triangle strips with adjacency, etc.

##Input Layout##
The input layout describes how the input assembler should read the vertices in the vertex buffer. It describes the layout of the attributes a vertex has, such as position and color, and how big they are. This is so that the Input Assembler knows how to pass the data to the pipeline stages, where a vertex starts, and where a vertex ends. An example of an input layout might look like this:

D3D12_INPUT_ELEMENT_DESC inputLayout[] =
{
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, 12, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }
};

This input layout describes a vertex that has a position consisting of three 32-bit floating point values, and a color consisting of four 32-bit floating point values. With this layout each vertex occupies 28 bytes, so to get to the next vertex the IA advances the current address by 28 bytes (the stride itself is supplied in the vertex buffer view, described later). It knows the first 12 bytes are the position, and the next 16 bytes are the color. We will talk a bit more about creating input layouts when we get to it in code.
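The byte offsets and the 28-byte vertex size from the example layout can be checked at compile time. The sketch below is plain C++ and needs no Direct3D headers. It assumes a `float` is 4 bytes and that the vertex structure is tightly packed. The constant names and the `VertexByteOffset()` helper are made up for illustration.

```cpp
#include <cstddef>

// Vertex type matching the example layout: position first, then color.
struct ColorVertex {
    float x, y, z;     // POSITION, DXGI_FORMAT_R32G32B32_FLOAT: 3 * 4 = 12 bytes
    float r, g, b, a;  // COLOR, DXGI_FORMAT_R32G32B32A32_FLOAT: 4 * 4 = 16 bytes
};

// Offsets that would go into the byte-offset field of each input element.
constexpr std::size_t kPositionOffset = offsetof(ColorVertex, x);
constexpr std::size_t kColorOffset = offsetof(ColorVertex, r);

// Size of one vertex, i.e. how far the IA advances to reach the next vertex.
constexpr std::size_t kVertexStride = sizeof(ColorVertex);

static_assert(kPositionOffset == 0, "position starts at byte 0");
static_assert(kColorOffset == 12, "color starts right after the 12-byte position");
static_assert(kVertexStride == 28, "12 bytes of position + 16 bytes of color");

// Byte offset of the vertex at `index` from the start of the vertex buffer.
constexpr std::size_t VertexByteOffset(std::size_t index) {
    return index * kVertexStride;
}
```

Keeping checks like these next to the vertex structure means a mismatch between the structure and the input layout shows up as a compile error instead of garbled geometry.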
####Vertex and Pixel Shaders####
Although a PSO only requires a vertex shader to be defined, to actually draw anything onto a render target we will also need a pixel shader. We will be making two very simple shader functions for these two programmable pipeline stages.

In DirectX, shaders are programmed in a language called HLSL, or High Level Shader Language. This language is similar to C, so it's pretty easy to understand, as long as you understand what the shader stage takes as input and returns as output.

To use a shader, the shader function is compiled to something called bytecode. Bytecode is a chunk of data that we pass to the GPU. The GPU will run that bytecode for that shader stage. In this tutorial we are compiling at runtime so that we can debug more easily, but in a real game, you will compile the shader code using a program called fxc.exe (Visual Studio will automatically compile the shader files, which you can turn off if you want by right clicking the shader file, going to Properties, and setting "Excluded From Build" to "Yes"). When compiling with fxc.exe, the default output is a .cso file, or Compiled Shader Object file.

You will notice I use the function name "main" for all the shader functions. When compiling a shader, by default fxc.exe looks for a function called "main" inside the shader file. You can change it to whatever you want.

To create a shader, go to your Solution Explorer and right click the resources folder (or create a shaders folder and right click that), click Add, then New Item. In the window that pops up, under Visual C++, there is a tab called "HLSL". Click on that, and on the right side of the window you will see a list of possible shaders. Choose the shader you want, then click Add (after changing the name if you want). Visual Studio will then create sample code for the shader you selected. By default Visual Studio will compile this shader using fxc.exe.
You can turn that off if you want to compile it yourself or are compiling at runtime, as I mentioned above.

##Vertex Shader##
The vertex shader is required to create a PSO successfully. It is the only shader required to be set in a PSO. You will rarely set only a vertex shader though, because if a pixel shader is not set as well, nothing will be drawn to your render target. The reason it is possible to set only a vertex shader is the stream output, which outputs data from the GPU after the vertex shader, or after the geometry shader if one is set.

The vertex shader takes in a single vertex, and outputs a single vertex. Usually the vertex being passed in has a position relative to its own model space. You will need to transform that position into the -1.0 to 1.0 range the rasterizer expects (usually called clip space or normalized device coordinates) by first multiplying it by its world matrix, then the view matrix, and finally the projection matrix, WVP for short. We will talk about this in an upcoming tutorial.

For this tutorial, we have a very simple vertex shader which does no computation; all it does is return the input vertex position. The vertex position we give it is already in that space, so we do not need to do anything more.

// simple vertex shader
float4 main( float3 pos : POSITION ) : SV_POSITION
{
    return float4(pos, 1.0f);
}

##Pixel Shader##
The pixel shader works with pixel **fragments**. A pixel fragment is a candidate for the final pixel at a given position on the render target. Not all pixel fragments that go through a pixel shader will end up on the render target, and that is because of depth/stencil buffers. You might draw an object in the back of the scene, so all the fragments for that object go through the pixel shader, but then you draw a wall in front of that object. The wall's pixels will be the final pixels drawn on the render target instead of the object's pixels (if there is nothing between the camera and the wall).
The pixel shader can take in whatever the vertex shader returned, and outputs a float4 color, in the format of "Red, Green, Blue, Alpha", RGBA.

We are creating a simple pixel shader. It takes no input, and returns the color green:

// simple pixel shader
float4 main() : SV_TARGET
{
    return float4(0.0f, 1.0f, 0.0f, 1.0f); // Red, Green, Blue, Alpha
}

####Pipeline State Objects (PSO)####
.[https://msdn.microsoft.com/en-us/library/windows/desktop/dn899196%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396][MSDN Managing Graphics Pipeline State in Direct3D 12]

A pipeline state object is an object that contains shaders and pipeline states. A game will create many of these at initialization time. They are part of what lets Direct3D 12 perform better than previous DirectX versions: we are able to create many pipeline state objects at initialization time, and then, instead of setting individual states many times throughout a frame, we set a pipeline state object which is already initialized. In one frame, you will set a PSO every time the pipeline state needs to change.

The pipeline states that a PSO sets are:

- **Shader Bytecode** - *The shader functions enabled in the graphics pipeline*
- **Input Layout** - *The format of your vertex structure*
- **Primitive Topology Type** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770385(v=vs.85).aspx][D3D12_PRIMITIVE_TOPOLOGY_TYPE] enumeration saying whether the Input Assembler should assemble the geometry as Points, Lines, Triangles, or Patches (used for tessellation). This is different from the adjacency and ordering that is set in the command list (triangle list, triangle strip, etc).*
- **Blend State** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770339%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396][D3D12_BLEND_DESC] structure.
This describes the blend state that is used when the output merger writes a pixel fragment to the render target.*
- **Rasterizer State** - *Rasterizer state such as cull mode, wireframe/solid rendering, antialiasing, etc. A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770387%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396][D3D12_RASTERIZER_DESC] structure.*
- **Depth/Stencil State** - *Used for depth/stencil testing. One of the next tutorials will explain depth buffer testing, a later one will explain stencil testing.*
- **Render Targets** - *This is a list of render targets that the Output Merger should write to*
- **Number of render targets** - *It is possible to write to more than one render target at a time.*
- **Multi-Sampling** - *Parameters describing the multi-sampling count and quality. These must match the render target.*
- **Stream Output Buffer** - *A buffer that the stream output writes to. The stream output writes to the stream output buffer after the Geometry shader if one is set, otherwise after the Vertex shader*
- **The Root Signature** - *A root signature is basically a parameter list of data that the shader functions expect. The shaders must be compatible with the Root Signature.*

Some pipeline states can be set outside a PSO.
These states are:

- Resource Binding (.[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986882(v=vs.85).aspx][index buffers], .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986883(v=vs.85).aspx][vertex buffers], .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986886(v=vs.85).aspx][stream output targets], .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986884(v=vs.85).aspx][render targets], .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903908(v=vs.85).aspx][descriptor heaps])
- .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903900(v=vs.85).aspx][Viewports]
- .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903899(v=vs.85).aspx][Scissor Rectangles]
- .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903886(v=vs.85).aspx][Blend Factor]
- .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903887(v=vs.85).aspx][Depth/Stencil reference value]
- .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903885(v=vs.85).aspx][IA primitive topology order/adjacency]

####Root Signatures####
Root signatures were introduced in the last tutorial, .[http://www.braynzarsoft.net/viewtutorial/q16390-03-initializing-directx-12][Initializing Direct3D 12].

Basically, a root signature defines the data that the shaders in the current PSO will use. Its parameters are either *Root Constants*, *Root Descriptors*, or *Descriptor Tables*. The parameters in the root signature that *define* the data that the shaders will use (the entries in the root signature) are called **Root Parameters**. The actual data *values* that change during runtime are called **Root Arguments**.

Like PSOs, root signatures are created at initialization time. Changing the root signature can be expensive, so you want to group PSOs together by root signature when possible (so that you are not constantly changing the root signature back and forth).
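One way to picture the grouping advice is to sort the draws for a frame so that draws sharing a root signature sit next to each other. The sketch below is plain C++ and is not tied to any Direct3D 12 API. `DrawItem` and both functions are hypothetical, integer ids stand in for root signature and PSO pointers, and it assumes the draws can be reordered safely (for example opaque geometry).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical draw record; the ids stand in for real root signature / PSO objects.
struct DrawItem {
    int rootSignatureId;
    int psoId;
};

// Orders draws so that items sharing a root signature (and then a PSO) are adjacent.
inline void SortByRootSignature(std::vector<DrawItem>& draws) {
    std::stable_sort(draws.begin(), draws.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         if (a.rootSignatureId != b.rootSignatureId)
                             return a.rootSignatureId < b.rootSignatureId;
                         return a.psoId < b.psoId;
                     });
}

// Counts how often the root signature would have to be set if the draws
// were submitted in the given order.
inline std::size_t CountRootSignatureChanges(const std::vector<DrawItem>& draws) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < draws.size(); ++i) {
        if (i == 0 || draws[i].rootSignatureId != draws[i - 1].rootSignatureId)
            ++changes;
    }
    assert(changes <= draws.size());  // at most one change per draw
    return changes;
}
```

For five draws whose root signatures alternate 1, 2, 1, 2, 1, the unsorted order needs five root signature sets, while the sorted order needs only two.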
You do not have to keep track of a fence or anything else when changing the root arguments between draw calls. Root arguments are automatically versioned, so each draw call gets its own copy of the root arguments. This is different from uploading resources to the GPU. When uploading resources, you must check for yourself that the upload (or copy) has completed before using that resource, by setting and watching a fence.

In this tutorial, we are using the root signature only to say that we want to use the Input Assembler. We do this by specifying the D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT flag when initializing our root signature. By default D3D12_ROOT_SIGNATURE_FLAG_NONE is the only flag set on the root signature, which means that we would not be able to pass vertex data to the vertex shader.

It is possible to execute draw calls without binding a vertex buffer. In that case we just say how many vertices to draw, and build each vertex in the vertex shader, using a system-value semantic like SV_VertexID to identify the current vertex number. We can then use the geometry shader or tessellation to generate more geometry from there. This could be useful in particle effect engines.

In later tutorials we will set up root constants in the root signature (for things like view and projection matrices), as well as root descriptors and descriptor tables for things like texturing.

####Viewports####
The viewport specifies the area of the render target which the scene will be drawn onto. There are six values to be set in the viewport: top left X, top left Y, width, height, min Z, and max Z. The top left X and Y are in pixels, relative to the top left of the render target. The width and height are in pixels, defining the right and bottom of the viewport relative to the top left X and Y. Finally the min Z and max Z define the Z range of the scene to be drawn. Anything outside this range will not be drawn.
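As a rough illustration of what the six viewport values do, here is a minimal sketch of the mapping from the -1.0 to 1.0 range that leaves the vertex shader to pixel coordinates on the render target. The `Viewport` struct is a hypothetical stand-in with the same six fields as D3D12_VIEWPORT, so the sketch compiles without the Windows SDK. `ScreenPoint` and `ToScreen()` are also made up for this example. The formula assumes the usual Direct3D viewport mapping, with incoming depth between 0.0 and 1.0 rescaled into the min Z to max Z range.

```cpp
// Hypothetical stand-in with the same six fields as D3D12_VIEWPORT.
struct Viewport {
    float TopLeftX, TopLeftY;  // top left corner, in pixels
    float Width, Height;       // size, in pixels
    float MinDepth, MaxDepth;  // depth range
};

struct ScreenPoint {
    float x, y, z;
};

// Maps a position in the -1..1 range (x to the right, y up, depth 0..1)
// to pixel coordinates (x to the right, y down) inside the viewport.
constexpr ScreenPoint ToScreen(const Viewport& vp, float x, float y, float z) {
    return ScreenPoint{
        vp.TopLeftX + (x + 1.0f) * 0.5f * vp.Width,    // -1 -> left edge, +1 -> right edge
        vp.TopLeftY + (1.0f - y) * 0.5f * vp.Height,   // +1 -> top edge,  -1 -> bottom edge
        vp.MinDepth + z * (vp.MaxDepth - vp.MinDepth)  // 0 -> MinDepth,   1 -> MaxDepth
    };
}
```

With an 800 by 600 viewport at the top left of the render target, the center position (0, 0) lands on pixel (400, 300), and the corner (-1, 1) lands on pixel (0, 0). Note that y is flipped, because pixel rows count downwards.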
The viewport converts view space to screen space, where screen space is in pixels, and view space runs from -1.0 to 1.0 from left to right, and from 1.0 to -1.0 from top to bottom. (What this tutorial calls view space is more commonly called normalized device coordinates, or NDC; in most graphics texts "view space" means camera space.) Positions coming out of the vertex shader are in view space. Anything to be rendered needs to be between -1.0 and 1.0 on the x axis, between 1.0 and -1.0 on the y axis, and between min Z and max Z on the Z axis.

View space is the space that comes out of the vertex and geometry shaders. Anything between (-1.0, -1.0) and (1.0, 1.0) will be in view space.

+[http://www.braynzarsoft.net/image/100214][View Space]

The viewport "stretches" view space to the defined screen space's width and height in pixels.

+[http://www.braynzarsoft.net/image/100215][Screen Space (View space stretched to screen space defined by viewport)]

Viewports do not have to cover the entire screen space. You could for example have two viewports, one for the left side of the screen and one for the right side, if you had a multi-player game. Another example might be a radar or mini map on the screen. You would render the radar or mini map to a small section of the render target, defined by the viewport.

####Scissor Rectangles####
The scissor rect specifies the area which will be drawn onto. Any pixel fragments outside the scissor rect will be cut and will not even make it to the pixel shader.

+[http://www.braynzarsoft.net/image/100216][Scissor Rect]

The scissor rect has four members: left, top, right, and bottom. These are in pixels, relative to the top left of the render target.

####The Code####
Let's get into the code now. Just to be clear, I do not code like this in real life. The only reason I have decided to write code the way I do in these tutorials is that I feel it makes it easier to see how DirectX works without needing to jump between classes and files. When you are making your own application, try to stay away from globals.
####New Globals####
Here are the new globals we will be using in this tutorial.

First we have a PSO. This PSO will contain our default pipeline state. In this tutorial, we only have one pipeline state object, but in a real application you will have many.

Next is the root signature. We will use this root signature to say that the Input Assembler will be used, which means that we will bind a vertex buffer containing information about each vertex, such as its position. Each vertex in the vertex buffer will be passed to the vertex shader.

Next is our viewport. We only have one viewport because we will be drawing to the entire render target.

After our viewport is a scissor rect. The scissor rect says where to draw and where not to draw. I have noticed that one of my computers works without setting a scissor rect, while the other one will draw nothing without a scissor rect defined. In Direct3D 12 the scissor test is always enabled, so you should always set one.

We have an ID3D12Resource, which is where we will store our vertex buffer. This resource will live in a default heap, and we will upload the vertex data to it from a temporary upload heap.

Finally we have a vertex buffer view. This view simply describes the address, stride (size of each vertex) and total size of our vertex buffer.

ID3D12PipelineState* pipelineStateObject; // pso containing a pipeline state
ID3D12RootSignature* rootSignature; // root signature defines data shaders will access
D3D12_VIEWPORT viewport; // area that output from rasterizer will be stretched to.
D3D12_RECT scissorRect; // the area to draw in. pixels outside that area will not be drawn onto
ID3D12Resource* vertexBuffer; // a default buffer in GPU memory that we will load vertex data for our triangle into
D3D12_VERTEX_BUFFER_VIEW vertexBufferView; // a structure containing a pointer to the vertex data in gpu memory
                                           // the total size of the buffer, and the size of each element (vertex)

####The Vertex Structure####
We will need to define a vertex structure.
We will create a number of these vertex structures when we create the vertex buffer. A vertex buffer is an array of vertices. In this tutorial, a vertex only has a position, which is defined by 3 floating point values, x, y and z. We are using the DirectX Math library, so we will use the XMFLOAT3 structure to hold our vertex position:

struct Vertex {
    XMFLOAT3 pos;
};

####Create a Root Signature####
.[https://msdn.microsoft.com/en-us/library/windows/desktop/dn859357(v=vs.85).aspx][MSDN Creating a Root Signature]

A root signature is stored in an *ID3D12RootSignature* interface. To create a root signature, we will fill out a *CD3DX12_ROOT_SIGNATURE_DESC* structure, defined in the extended dx12 header. This is a wrapper around the .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986747(v=vs.85).aspx][D3D12_ROOT_SIGNATURE_DESC] structure:

typedef struct .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986747(v=vs.85).aspx][D3D12_ROOT_SIGNATURE_DESC] {
    UINT NumParameters;
    const D3D12_ROOT_PARAMETER *pParameters;
    UINT NumStaticSamplers;
    const D3D12_STATIC_SAMPLER_DESC *pStaticSamplers;
    D3D12_ROOT_SIGNATURE_FLAGS Flags;
} D3D12_ROOT_SIGNATURE_DESC;

- **NumParameters** - *This is the number of slots our root signature will have. A slot is a **root parameter**. A root parameter may be a root constant, root descriptor, or a descriptor table.*
- **pParameters** - *This is an array of .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn879477(v=vs.85).aspx][D3D12_ROOT_PARAMETER] structures which define each of the root parameters this root signature contains. We will discuss these in a later tutorial.*
- **NumStaticSamplers** - *This is the number of static samplers the root signature will contain.*
- **pStaticSamplers** - *An array of .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986748%28v=vs.85%29.aspx][D3D12_STATIC_SAMPLER_DESC] structures.
These structures define static samplers.* - **Flags** - *A combination of .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn879480(v=vs.85).aspx][D3D12_ROOT_SIGNATURE_FLAGS] OR'ed (|) together.* These are flags that may be used when creating the root signature. The default flag used when creating a root signature is D3D12_ROOT_SIGNATURE_FLAG_NONE. In this tutorial, we are only using the root signature to tell the pipeline to use the Input Assembler, so that we can pass a vertex buffer through the pipeline. This root signature will not have any root parameters for this tutorial, although we will add root parameters to it in a later tutorial. typedef enum .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn879480(v=vs.85).aspx][D3D12_ROOT_SIGNATURE_FLAGS] { D3D12_ROOT_SIGNATURE_FLAG_NONE = 0, D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT = 0x1, D3D12_ROOT_SIGNATURE_FLAG_DENY_VERTEX_SHADER_ROOT_ACCESS = 0x2, D3D12_ROOT_SIGNATURE_FLAG_DENY_HULL_SHADER_ROOT_ACCESS = 0x4, D3D12_ROOT_SIGNATURE_FLAG_DENY_DOMAIN_SHADER_ROOT_ACCESS = 0x8, D3D12_ROOT_SIGNATURE_FLAG_DENY_GEOMETRY_SHADER_ROOT_ACCESS = 0x10, D3D12_ROOT_SIGNATURE_FLAG_DENY_PIXEL_SHADER_ROOT_ACCESS = 0x20, D3D12_ROOT_SIGNATURE_FLAG_ALLOW_STREAM_OUTPUT = 0x40 } D3D12_ROOT_SIGNATURE_FLAGS; To tell the pipeline to use the Input Assembler, we must create a root signature with the D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT flag. Without this flag, the input assembler will not be used, and calling draw(numberOfVertices) will call the vertex shader numberOfVertices times with empty vertices. Basically we can create the vertices in the vertex shader using the vertex index. We can pass these to the geometry shader to create more geometry. By not using the D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT flag, we save one slot in the root signature that can be used for a root constant, root descriptor, or descriptor table. This optimization is minimal. 
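Because the flag values listed above are single bits, several of them can be combined with a bitwise OR and tested with a bitwise AND. The sketch below mirrors those values in a hypothetical `RootSignatureFlags` enum so that it builds without the Windows SDK. The short `k...` names, `kExampleFlags` and `HasFlag()` are made up for illustration, and the particular combination shown is only an example, not the set of flags this tutorial uses.

```cpp
#include <cstdint>

// Hypothetical mirror of the D3D12_ROOT_SIGNATURE_FLAGS values listed above.
enum RootSignatureFlags : std::uint32_t {
    kNone = 0,
    kAllowInputAssemblerInputLayout = 0x1,
    kDenyVertexShaderRootAccess = 0x2,
    kDenyHullShaderRootAccess = 0x4,
    kDenyDomainShaderRootAccess = 0x8,
    kDenyGeometryShaderRootAccess = 0x10,
    kDenyPixelShaderRootAccess = 0x20,
    kAllowStreamOutput = 0x40
};

// Example: use the input assembler, and deny root access to the hull, domain
// and geometry stages, which an application with only vertex and pixel
// shaders would never run.
constexpr std::uint32_t kExampleFlags =
    kAllowInputAssemblerInputLayout |
    kDenyHullShaderRootAccess |
    kDenyDomainShaderRootAccess |
    kDenyGeometryShaderRootAccess;

// Returns true if `flag` is set in `flags`.
constexpr bool HasFlag(std::uint32_t flags, RootSignatureFlags flag) {
    return (flags & flag) != 0;
}
```

Here `kExampleFlags` works out to 0x1 | 0x4 | 0x8 | 0x10 = 0x1D. With the real headers, the same OR expression written with the D3D12_ROOT_SIGNATURE_FLAG_ names would be passed as the flags argument when initializing the root signature description.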
If D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT is specified, we must create and use an input layout. In this tutorial, we are going to pass a vertex buffer through the pipeline, so we specify the D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT flag.

The other flags are used to deny stages of the pipeline access to the root signature. When using resources or the root signature, you want to deny access to any shader stages that do not need it so that the GPU can optimize. Allowing all shaders access to everything slows down performance.

A lot of the Direct3D 12 functions allow you to pass an ID3DBlob pointer to store error messages in. You can pass a nullptr if you do not care to read the error. You can get a null terminated char array from the ID3DBlob returned from these functions by calling the GetBufferPointer() method of the blob. This char array will contain the error message. We will see this in a later section.

In this tutorial we will define and create the root signature in code at runtime. You could however .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn913202(v=vs.85).aspx][define the root signature in HLSL] instead.

The first thing we do here is fill out a CD3DX12_ROOT_SIGNATURE_DESC. We want the input assembler to be used so we specify the D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT flag.

Once we have created the description, we "serialize" the root signature into bytecode. We will use this bytecode to create a root signature object.

After we have the root signature bytecode, we create the root signature by calling the CreateRootSignature() method of our device.
HRESULT .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn899182%28v=vs.85%29.aspx][CreateRootSignature]( [in] UINT nodeMask, [in] const void *pBlobWithRootSignature, [in] SIZE_T blobLengthInBytes, REFIID riid, [out] void **ppvRootSignature ); // create root signature CD3DX12_ROOT_SIGNATURE_DESC rootSignatureDesc; rootSignatureDesc.Init(0, nullptr, 0, nullptr, D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT); ID3DBlob* signature; hr = D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, &signature, nullptr); if (FAILED(hr)) { return false; } hr = device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(&rootSignature)); if (FAILED(hr)) { return false; } ####Create Vertex and Pixel Shaders#### Shaders in Direct3D are written in a language called **H**igh **L**evel **S**hading **L**anguage, or more simply, **HLSL**. To create a shader in visual studio, open your solution explorer, right click on resources (you can put the shader file anywhere, but since there is already a resources folder we will store them there), hover over "Add", then select "New Item...". +[http://www.braynzarsoft.net/image/100219][Add New Item] The add new item window will open. On the left panel, expand "Visual C++". You will see an item called "HLSL". Select that item, and on the right side, choose the shader you want to create. In this tutorial we will need a Vertex Shader, as well as a Pixel Shader. Once you select the shader you want, give it a name at the bottom of the window, then click "Add". +[http://www.braynzarsoft.net/image/100220][Add Shader File] Once you click add, the shader will open up, and you will be presented with the most basic code for that shader. All shaders functions created this way will be named "main". You can change this name, but you must also be sure to change it in your c++ code to look for the right function name. 
fxc.exe will also by default look for a function named "main" when compiling, so if you were to change this function name and try to compile your code, you would most likely be presented with an error like:

FXC : error X3501: 'main': entrypoint not found

To fix this error, you can either change the name of your shader function back to "main", or open the Solution Explorer, right click the shader file, and click Properties. The properties window will open. On the left panel, click on "HLSL Compiler". On the right there is an option called "Entrypoint Name". By default the value is "main". Change this to the name of your shader function.

When you build your program, the HLSL shader files will by default be compiled as well, using fxc.exe. The output is a set of "compiled shader object" files with the default extension .cso (which can be changed in the properties), which contain the shader function *bytecode*. You can also tell Visual Studio not to compile these files, by going to the "General" tab in the properties window and setting the option "Excluded From Build" to "Yes".

Alright, let's talk about the tutorial code now.

When creating a shader, you must provide a pointer to an ID3DBlob containing the shader bytecode. When debugging, you may want to compile the shader files at runtime to catch any errors in the shaders. We can compile the shader code at runtime using the D3DCompileFromFile() function. This function will compile the shader code to shader bytecode, and store it in an ID3DBlob object. When you release your game, you would want to compile the shaders to compiled shader object files, and load those in rather than compiling the shader code at runtime during initialization.
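For the release path mentioned above, loading a precompiled .cso file just means reading its bytes into memory. Here is a minimal sketch using only the C++ standard library. `LoadBinaryFile()` is a hypothetical helper, not something this tutorial's code uses, and any file name passed to it is just an example.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole binary file (for example a .cso produced by fxc.exe) into memory.
// Returns an empty vector if the file cannot be opened.
inline std::vector<char> LoadBinaryFile(const std::string& path) {
    assert(!path.empty());  // an empty path is a caller error
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return {};
    }
    std::istreambuf_iterator<char> begin(file), end;
    return std::vector<char>(begin, end);
}
```

The vector's `data()` and `size()` are the pointer and length that a D3D12_SHADER_BYTECODE structure needs. The D3DCompiler library also provides D3DReadFileToBlob(), which loads a file straight into an ID3DBlob, if you would rather keep working with blobs.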
HRESULT WINAPI .[https://msdn.microsoft.com/en-us/library/windows/desktop/hh446872(v=vs.85).aspx][D3DCompileFromFile](
  _In_ LPCWSTR pFileName,
  _In_opt_ const D3D_SHADER_MACRO *pDefines,
  _In_opt_ ID3DInclude *pInclude,
  _In_ LPCSTR pEntrypoint,
  _In_ LPCSTR pTarget,
  _In_ UINT Flags1,
  _In_ UINT Flags2,
  _Out_ ID3DBlob **ppCode,
  _Out_opt_ ID3DBlob **ppErrorMsgs
);

- **pFileName** - *This is the name of the file that contains the shader code.*
- **pDefines** - *This is an array of .[https://msdn.microsoft.com/en-us/library/windows/desktop/ff728732(v=vs.85).aspx][D3D_SHADER_MACRO] structures which define shader macros. Set this to nullptr if no macros are used.*
- **pInclude** - *This is a pointer to a .[https://msdn.microsoft.com/en-us/library/windows/desktop/ff728746(v=vs.85).aspx][ID3DInclude] interface which is used to handle **#include**s in the shader code. If you have any #include lines in your shader code and set this to nullptr, your shader will fail to compile.*
- **pEntrypoint** - *This is the name of the shader function. In this tutorial we keep the default shader function name "main".*
- **pTarget** - *This is the shader model you would like to use when compiling your shader. We are using shader model 5.0, so for example when we compile our vertex shader, we set this to **"vs_5_0"**. The pixel shader will be set to **"ps_5_0"**.*
- **Flags1** - *These are .[https://msdn.microsoft.com/en-us/library/windows/desktop/gg615083(v=vs.85).aspx][compile option flags] OR'ed together. For this tutorial, and for debugging, we use the D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION flags.*
- **Flags2** - *More compile flags used for effect files.
If we are not compiling an effect file, this parameter will be ignored, and we can set it to 0.*
- **ppCode** - *This is the address of an ID3DBlob pointer that will receive the compiled shader bytecode.*
- **ppErrorMsgs** - *This is the address of an ID3DBlob pointer that will receive any errors that occur when compiling the shader code.*

If an error occurred compiling the shader, we can access it with the ID3DBlob we passed into the last parameter. The message is a NULL terminated C string (char array). We can get the error message by casting the return from the GetBufferPointer() of the ID3DBlob containing the error. The error blob can be null when the compiler produced no message (for example when the shader file was not found), so check it before reading from it.

We can output the error in the output window of Visual Studio with the OutputDebugString() function.

When we create our PSO, we need to provide a .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770405(v=vs.85).aspx][D3D12_SHADER_BYTECODE] structure which contains the shader bytecode and the size of the shader bytecode. We can get a pointer to the shader bytecode with the GetBufferPointer() method of the ID3DBlob we passed to D3DCompileFromFile(), and we can get the size of the bytecode with the GetBufferSize() method of the ID3DBlob.

// create vertex and pixel shaders

// when debugging, we can compile the shader files at runtime.
// but for release versions, we can compile the hlsl shaders
// with fxc.exe to create .cso files, which contain the shader
// bytecode.
// We can load the .cso files at runtime to get the
// shader bytecode, which of course is faster than compiling
// them at runtime

// compile vertex shader
ID3DBlob* vertexShader; // d3d blob for holding vertex shader bytecode
ID3DBlob* errorBuff = nullptr; // a buffer holding the error data if any
hr = D3DCompileFromFile(L"VertexShader.hlsl",
    nullptr,
    nullptr,
    "main",
    "vs_5_0",
    D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION,
    0,
    &vertexShader,
    &errorBuff);
if (FAILED(hr))
{
    // errorBuff is null when there is no compiler message (for example, file not found)
    if (errorBuff) OutputDebugStringA((char*)errorBuff->GetBufferPointer());
    return false;
}

// fill out a shader bytecode structure, which is basically just a pointer
// to the shader bytecode and the size of the shader bytecode
D3D12_SHADER_BYTECODE vertexShaderBytecode = {};
vertexShaderBytecode.BytecodeLength = vertexShader->GetBufferSize();
vertexShaderBytecode.pShaderBytecode = vertexShader->GetBufferPointer();

// compile pixel shader
ID3DBlob* pixelShader;
hr = D3DCompileFromFile(L"PixelShader.hlsl",
    nullptr,
    nullptr,
    "main",
    "ps_5_0",
    D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION,
    0,
    &pixelShader,
    &errorBuff);
if (FAILED(hr))
{
    if (errorBuff) OutputDebugStringA((char*)errorBuff->GetBufferPointer());
    return false;
}

// fill out shader bytecode structure for pixel shader
D3D12_SHADER_BYTECODE pixelShaderBytecode = {};
pixelShaderBytecode.BytecodeLength = pixelShader->GetBufferSize();
pixelShaderBytecode.pShaderBytecode = pixelShader->GetBufferPointer();

####Create an Input Layout####
Here we create the input layout, which describes our vertices inside the vertex buffer to the input assembler. The input assembler will use the input layout to organize and pass vertices to the stages of the pipeline.
To create an input layout, we fill out an array of D3D12_INPUT_ELEMENT_DESC structures, one for each attribute of the vertex structure, such as position, texture coordinates or color:

typedef struct .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770377(v=vs.85).aspx][D3D12_INPUT_ELEMENT_DESC] {
  LPCSTR SemanticName;
  UINT SemanticIndex;
  DXGI_FORMAT Format;
  UINT InputSlot;
  UINT AlignedByteOffset;
  D3D12_INPUT_CLASSIFICATION InputSlotClass;
  UINT InstanceDataStepRate;
} D3D12_INPUT_ELEMENT_DESC;

- **SemanticName** - *This is the name of the parameter. The input assembler will associate this attribute to an input with the same semantic name in the shaders. This can be anything as long as the semantic name here matches one of the input parameters to the vertex shader.*
- **SemanticIndex** - *This is only needed if more than one element has the same semantic name. For example, you have two elements with the semantic name "COLOR". Inside your shader you have two vertex inputs called color1 and color2. You would give these input parameters the semantic names "COLOR0" and "COLOR1" respectively. This parameter says which vertex input parameter (COLOR0 or COLOR1) to associate this attribute with.*
- **Format** - *This is a .[https://msdn.microsoft.com/en-us/library/windows/desktop/bb173059(v=vs.85).aspx][DXGI_FORMAT] enumeration. This will define the format this attribute is in. For example, we have a position consisting of 3 floating point values, x, y, and z. A floating point value is 4 bytes, or 32 bits, so we set this argument to DXGI_FORMAT_R32G32B32_FLOAT, which says there are 3 float values in this attribute. This should then be mapped to a **float3** parameter in the shader.*
- **.[https://msdn.microsoft.com/en-us/library/windows/desktop/bb205117(v=vs.85).aspx#Input_Slots][InputSlot]** - *You may bind multiple vertex buffers to the input assembler. Each vertex buffer is bound to a **slot**.
We are only binding one vertex buffer at a time, so we set this to 0 (the first slot).*
- **AlignedByteOffset** - *This is the offset in bytes from the beginning of the vertex structure to the start of this attribute. The first attribute will always be 0 here. We only have one attribute, position, so we set this to 0. When we add color later, it will be a second attribute, and its offset will be 12: the position before it has 3 floats of 4 bytes each, so 4x3 is 12. We can also read this off the format of the previous element: position uses DXGI_FORMAT_R32G32B32_FLOAT, which is 96 bits, or 12 bytes (96 / 8 = 12), because each byte is 8 bits.*
- **InputSlotClass** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770376(v=vs.85).aspx][D3D12_INPUT_CLASSIFICATION] enumeration. This specifies if this element is per vertex or per instance. More about this when we get to instancing. For now, we will use D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA since we are not instancing.*
- **InstanceDataStepRate** - *This is the number of instances to draw before going to the next element. If we set D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, we must set this to 0.*

Once we create an input element array, we fill out a D3D12_INPUT_LAYOUT_DESC structure. This structure will be passed as an argument when we create a PSO. If we were not using the input assembler as defined in the root signature, there would be no need for an input layout for any PSOs that are associated with that root signature.

We can get the size (number of elements) of an array in C++ by using sizeof(array), which gives us the total number of bytes the array has, and dividing it by sizeof(element), the size of one element inside the array.

The D3D12_INPUT_LAYOUT_DESC contains the number of input elements, and the array of input elements.
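The offset and element-count arithmetic above can be checked with a few lines of plain C++. This is only a sketch: `ColoredVertex`, `ElementSketch` and `layoutSketch` are hypothetical names invented for the illustration (the tutorial's vertex only has a position at this point), and `ElementSketch` is a cut-down stand-in, not the real D3D12_INPUT_ELEMENT_DESC.

```cpp
#include <cstddef>   // offsetof

// Hypothetical vertex with a position followed by a color
// (an assumption for illustration; the tutorial adds color later).
struct ColoredVertex
{
    float pos[3];    // 3 floats * 4 bytes = 12 bytes, starts at offset 0
    float color[4];  // 4 floats * 4 bytes = 16 bytes, starts at offset 12
};

// Stand-in for one input element; only the fields needed for the
// arithmetic are kept (this is NOT D3D12_INPUT_ELEMENT_DESC).
struct ElementSketch
{
    const char* semanticName;
    unsigned    alignedByteOffset;
};

// offsetof lets the compiler work out each attribute's byte offset
const ElementSketch layoutSketch[] =
{
    { "POSITION", static_cast<unsigned>(offsetof(ColoredVertex, pos)) },
    { "COLOR",    static_cast<unsigned>(offsetof(ColoredVertex, color)) },
};

// number of elements = total bytes in the array / bytes in one element
const unsigned numElements = sizeof(layoutSketch) / sizeof(layoutSketch[0]);
```

Using offsetof instead of hand-written numbers keeps the layout in step with the vertex structure if an attribute is added or reordered.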
// create input layout

// The input layout is used by the Input Assembler so that it knows
// how to read the vertex data bound to it.

D3D12_INPUT_ELEMENT_DESC inputLayout[] =
{
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }
};

// fill out an input layout description structure
D3D12_INPUT_LAYOUT_DESC inputLayoutDesc = {};

// we can get the number of elements in an array by "sizeof(array) / sizeof(arrayElementType)"
inputLayoutDesc.NumElements = sizeof(inputLayout) / sizeof(D3D12_INPUT_ELEMENT_DESC);
inputLayoutDesc.pInputElementDescs = inputLayout;

####Create a Pipeline State Object (PSO)####
In a real application, you will usually end up with many PSOs. For this tutorial, we will only need one.

To create a pipeline state object, we must fill out a D3D12_GRAPHICS_PIPELINE_STATE_DESC structure:

typedef struct .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770370(v=vs.85).aspx][D3D12_GRAPHICS_PIPELINE_STATE_DESC] {
  ID3D12RootSignature *pRootSignature;
  D3D12_SHADER_BYTECODE VS;
  D3D12_SHADER_BYTECODE PS;
  D3D12_SHADER_BYTECODE DS;
  D3D12_SHADER_BYTECODE HS;
  D3D12_SHADER_BYTECODE GS;
  D3D12_STREAM_OUTPUT_DESC StreamOutput;
  D3D12_BLEND_DESC BlendState;
  UINT SampleMask;
  D3D12_RASTERIZER_DESC RasterizerState;
  D3D12_DEPTH_STENCIL_DESC DepthStencilState;
  D3D12_INPUT_LAYOUT_DESC InputLayout;
  D3D12_INDEX_BUFFER_STRIP_CUT_VALUE IBStripCutValue;
  D3D12_PRIMITIVE_TOPOLOGY_TYPE PrimitiveTopologyType;
  UINT NumRenderTargets;
  DXGI_FORMAT RTVFormats[8];
  DXGI_FORMAT DSVFormat;
  DXGI_SAMPLE_DESC SampleDesc;
  UINT NodeMask;
  D3D12_CACHED_PIPELINE_STATE CachedPSO;
  D3D12_PIPELINE_STATE_FLAGS Flags;
} D3D12_GRAPHICS_PIPELINE_STATE_DESC;

I'm pointing out which parameters are required because I had a tough time figuring it out. Even though some of the parameters may say not required, such as InputLayout, they ARE required if you are using the input assembler (your code may still run without them, but nothing will be drawn).
- **pRootSignature** - *REQUIRED. A pointer to our root signature.*
- **VS** - *REQUIRED. A pointer to the vertex shader bytecode. (used to manipulate vertices, most commonly converting them from object space, to world space, to view space, to projection space)*
- **PS** - *NOT REQUIRED. A pointer to the pixel shader bytecode. (used to draw pixel fragments)*
- **DS** - *NOT REQUIRED. A pointer to the domain shader bytecode (used for tessellation)*
- **HS** - *NOT REQUIRED. A pointer to the hull shader bytecode (also used for tessellation)*
- **GS** - *NOT REQUIRED. A pointer to the geometry shader bytecode (used to create geometry)*
- **StreamOutput** - *NOT REQUIRED. Used to send data from the pipeline (after geometry shader, or after vertex shader if geometry shader is not defined) out to a buffer resource, where your app or a later pass can use it.*
- **BlendState** - *REQUIRED. This is used for blending, such as transparency. For now we have a default blend state, but we will explain this more in a later tutorial.*
- **SampleMask** - *REQUIRED. This has to do with multi-sampling. It is a bit mask that selects which samples may be written; 0xffffffff enables every sample. This will be explained in a later tutorial.*
- **RasterizerState** - *REQUIRED. This is the state of the rasterizer. We will use a default state for now, but will have a tutorial on this later.*
- **DepthStencilState** - *NOT REQUIRED. This is the state of the depth/stencil buffer. Again this will be explained in a later tutorial.*
- **InputLayout** - *NOT REQUIRED. A D3D12_INPUT_LAYOUT_DESC structure defining the layout of a vertex.*
- **IBStripCutValue** - *NOT REQUIRED. A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986732(v=vs.85).aspx][D3D12_INDEX_BUFFER_STRIP_CUT_VALUE] enumeration. This is used when a triangle strip topology is defined.*
- **PrimitiveTopologyType** - *REQUIRED. A D3D12_PRIMITIVE_TOPOLOGY_TYPE defining the kind of primitive the vertices are put together into (point, line, triangle, patch).*
- **NumRenderTargets** - *REQUIRED.
This is the number of render target formats in the RTVFormats parameter.*
- **RTVFormats[8]** - *REQUIRED. An array of DXGI_FORMAT enumerations explaining the format of each render target. Must be the same format as the render targets used.*
- **DSVFormat** - *NOT REQUIRED. A single DXGI_FORMAT enumeration explaining the format of the depth/stencil buffer. Must be the same format as the depth/stencil buffer used.*
- **SampleDesc** - *REQUIRED. The sample count and quality for multi-sampling. Must be the same as the render target.*
- **NodeMask** - *NOT REQUIRED. A bit mask saying which GPU adapter to use. If you are only using one GPU, set this to 0.*
- **CachedPSO** - *NOT REQUIRED. This is a cool parameter. You can cache PSOs, such as into files, so the next time you initialize the PSO, compilation will happen much faster. This is a .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn914407(v=vs.85).aspx][D3D12_CACHED_PIPELINE_STATE] structure. The cached PSO is hardware dependent, which means you cannot share a PSO cached on one machine with another machine, otherwise you will get the D3D12_ERROR_ADAPTER_NOT_FOUND error code. Also if the graphics card driver was updated since the PSO was cached, you will get the D3D12_ERROR_DRIVER_VERSION_MISMATCH error code when trying to compile the PSO. You might want to cache the PSO the first time your application is run, then each time it's run again load in the cached PSO files. If you get either of the errors above, just create it without the cached PSO and save it to a file again.*
- **Flags** - *NOT REQUIRED. A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986737(v=vs.85).aspx][D3D12_PIPELINE_STATE_FLAGS]. The only options are D3D12_PIPELINE_STATE_FLAG_NONE and D3D12_PIPELINE_STATE_FLAG_TOOL_DEBUG. By default D3D12_PIPELINE_STATE_FLAG_NONE is set.
The debug option will give extra information that is helpful when debugging.*

Looking at the code, we fill out a D3D12_GRAPHICS_PIPELINE_STATE_DESC structure, then create (which includes compiling) the PSO. We create the PSO with the CreateGraphicsPipelineState() method of the device interface.

// create a pipeline state object (PSO)

// In a real application, you will have many PSOs. For each different shader
// or different combinations of shaders, different blend states or different rasterizer states,
// different topology types (point, line, triangle, patch), or a different number
// of render targets you will need a PSO

// VS is the only required shader for a PSO. You might be wondering when a case would be where
// you only set the VS. It's possible that you have a PSO that only outputs data with the stream
// output, and not on a render target, which means you would not need anything after the stream
// output.

D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; // a structure to define a pso
psoDesc.InputLayout = inputLayoutDesc; // the structure describing our input layout
psoDesc.pRootSignature = rootSignature; // the root signature that describes the input data this pso needs
psoDesc.VS = vertexShaderBytecode; // structure describing where to find the vertex shader bytecode and how large it is
psoDesc.PS = pixelShaderBytecode; // same as VS but for pixel shader
psoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; // type of topology we are drawing
psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM; // format of the render target
psoDesc.SampleDesc = sampleDesc; // must be the same sample description as the swapchain and depth/stencil buffer
psoDesc.SampleMask = 0xffffffff; // sample mask has to do with multi-sampling. 0xffffffff enables all samples
psoDesc.RasterizerState = CD3DX12_RASTERIZER_DESC(D3D12_DEFAULT); // a default rasterizer state.
psoDesc.BlendState = CD3DX12_BLEND_DESC(D3D12_DEFAULT); // a default blend state.
psoDesc.NumRenderTargets = 1; // we are only binding one render target

// create the pso
hr = device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(&pipelineStateObject));
if (FAILED(hr))
{
    return false;
}

####Create a Vertex Buffer####
A vertex buffer is a list of vertex structures. To use a vertex buffer, we must get it to the GPU, then bind that vertex buffer (resource) to the input assembler.

To get a vertex buffer to the GPU, we have two options. The first option is to only use an upload heap, and upload the vertex buffer to the GPU each frame. This is slow since we need to copy the vertex buffer from RAM to video memory every frame, so generally you will not want to do it this way. The second option, which is what you will usually do, is to use an upload heap to upload the vertex buffer to the GPU, then *copy* the data from the upload heap to a *default heap*. The default heap will stay in memory until we overwrite it or release it. The second approach is preferable because the data only has to be copied once and can then be used for as long as it is needed, so it is the way we will do it in this tutorial.

We create a list of vertices and store them in the vList array. Here we create 3 vertices that make up a triangle. They are already defined in normalized device coordinates, so the vertex shader can pass them straight through without transforming them.

To create a resource together with its heap, we use the CreateCommittedResource() method of the device interface:

HRESULT .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn899178(v=vs.85).aspx][CreateCommittedResource](
  [in] const D3D12_HEAP_PROPERTIES *pHeapProperties,
  D3D12_HEAP_FLAGS HeapFlags,
  [in] const D3D12_RESOURCE_DESC *pResourceDesc,
  D3D12_RESOURCE_STATES InitialResourceState,
  [in, optional] const D3D12_CLEAR_VALUE *pOptimizedClearValue,
  REFIID riidResource,
  [out, optional] void **ppvResource
);

- **pHeapProperties** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn770373(v=vs.85).aspx][D3D12_HEAP_PROPERTIES] structure defining the heap properties.
We will use the helper structure .[https://msdn.microsoft.com/en-us/library/windows/desktop/mt186571(v=vs.85).aspx][CD3DX12_HEAP_PROPERTIES] to create the type of heap we want (upload and default heaps).*
- **HeapFlags** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986730(v=vs.85).aspx][D3D12_HEAP_FLAGS] enumeration. We will not have any flags so we specify D3D12_HEAP_FLAG_NONE.*
- **pResourceDesc** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903813(v=vs.85).aspx][D3D12_RESOURCE_DESC] structure describing the resource. We will use the helper structure .[https://msdn.microsoft.com/en-us/library/windows/desktop/mt186577(v=vs.85).aspx][CD3DX12_RESOURCE_DESC].*
- **InitialResourceState** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn986744(v=vs.85).aspx][D3D12_RESOURCE_STATES] enumeration. This is the initial state the resource will be in. For the upload buffer, we want it to be in a read state, so we specify D3D12_RESOURCE_STATE_GENERIC_READ for the state. For the default heap, we want it to be a copy destination, so we specify D3D12_RESOURCE_STATE_COPY_DEST. Once we copy the vertex buffer to the default heap, we will use a resource barrier to transition the default heap from a copy destination state to a vertex/constant buffer state.*
- **pOptimizedClearValue** - *A .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903795(v=vs.85).aspx][D3D12_CLEAR_VALUE] structure. If this was a render target or depth stencil, we could set this value to the value the depth/stencil buffer or render target would usually get cleared to. The GPU can do some optimizations to increase the performance of clearing the resource.
Our resource is a vertex buffer, so we set this value to nullptr.*
- **riidResource** - *Unique identifier for the type of the resulting resource interface.*
- **ppvResource** - *A pointer to a pointer to the resource interface object we will use to access this resource.*

We can set the name of the heap using the SetName() method of the interface. This is useful for graphics debugging, where we can distinguish resources by the name we give them. At the end of this tutorial we will take a quick look at the graphics debugger in Visual Studio.

Once we create a vertex buffer (list of vertices), we create an upload heap and a default heap. The upload heap is used to upload the vertex buffer to the GPU, so we can copy the data to the default heap, which will stay in memory until we either overwrite it or release it.

We can copy the data from the upload heap to the default heap using the UpdateSubresources() function.

UINT64 inline .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn899213(v=vs.85).aspx][UpdateSubresources](
  _In_ ID3D12GraphicsCommandList *pCmdList,
  _In_ ID3D12Resource *pDestinationResource,
  _In_ ID3D12Resource *pIntermediate,
  UINT64 IntermediateOffset,
  _In_ UINT FirstSubresource,
  _In_ UINT NumSubresources,
  _In_ D3D12_SUBRESOURCE_DATA *pSrcData
);

- **pCmdList** - *This is the command list we will use to create this command, which will copy the contents of the upload heap to the default heap.*
- **pDestinationResource** - *This is the destination of the copy command. In our case it will be the default heap, but it could also be a readback heap.*
- **pIntermediate** - *This is where we will copy the data from. In this tutorial it is the upload heap, but it could also be a default heap.*
- **IntermediateOffset** - *This is the number of bytes we want to offset the start from.
We want the whole vertex buffer to be copied, so we will not offset at all, and set this to 0.*
- **FirstSubresource** - *This is the index of the first subresource to start copying. We only have one, so we set this to 0.*
- **NumSubresources** - *This is the number of subresources we want to copy. Again we only have one so we set this to 1.*
- **pSrcData** - *This is a pointer to a .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn879485(v=vs.85).aspx][D3D12_SUBRESOURCE_DATA] structure. This structure contains a pointer to the memory where our data is, and the size in bytes of our resource.*

Once we create the copy command, our command list stores it in its command allocator, waiting to be executed. Before we can use the vertex buffer stored in the default heap, we MUST make sure it is finished uploading and copying to the default heap.

We close the command list, then execute it with the command queue. We increment the fence value for this frame and tell the command queue to increment the fence on the GPU side. Incrementing a fence is again a command, which will get executed once the command lists finish executing.

After we execute the copy command, and set the fence, we need to fill out our vertex buffer view. This is a .[https://msdn.microsoft.com/en-us/library/windows/desktop/dn903819(v=vs.85).aspx][D3D12_VERTEX_BUFFER_VIEW] structure which contains the GPU address, stride (size of vertex structure in bytes) and total size of the buffer. We will use this structure to bind the vertex buffer to the IA when we start drawing our scene.

// Create vertex buffer

// a triangle
Vertex vList[] = {
    { { 0.0f, 0.5f, 0.5f } },
    { { 0.5f, -0.5f, 0.5f } },
    { { -0.5f, -0.5f, 0.5f } },
};

int vBufferSize = sizeof(vList);

// create default heap
// default heap is memory on the GPU.
// Only the GPU has access to this memory
// To get data into this heap, we will have to upload the data using
// an upload heap
device->CreateCommittedResource(
    &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT), // a default heap
    D3D12_HEAP_FLAG_NONE, // no flags
    &CD3DX12_RESOURCE_DESC::Buffer(vBufferSize), // resource description for a buffer
    D3D12_RESOURCE_STATE_COPY_DEST, // we will start this heap in the copy destination state since we will copy data
                                    // from the upload heap to this heap
    nullptr, // optimized clear value must be null for this type of resource. used for render targets and depth/stencil buffers
    IID_PPV_ARGS(&vertexBuffer));

// we can give resource heaps a name so when we debug with the graphics debugger we know what resource we are looking at
vertexBuffer->SetName(L"Vertex Buffer Resource Heap");

// create upload heap
// upload heaps are used to upload data to the GPU. CPU can write to it, GPU can read from it
// We will upload the vertex buffer using this heap to the default heap
ID3D12Resource* vBufferUploadHeap;
device->CreateCommittedResource(
    &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD), // upload heap
    D3D12_HEAP_FLAG_NONE, // no flags
    &CD3DX12_RESOURCE_DESC::Buffer(vBufferSize), // resource description for a buffer
    D3D12_RESOURCE_STATE_GENERIC_READ, // GPU will read from this buffer and copy its contents to the default heap
    nullptr,
    IID_PPV_ARGS(&vBufferUploadHeap));
vBufferUploadHeap->SetName(L"Vertex Buffer Upload Resource Heap");

// store vertex buffer in upload heap
D3D12_SUBRESOURCE_DATA vertexData = {};
vertexData.pData = reinterpret_cast<BYTE*>(vList); // pointer to our vertex array
vertexData.RowPitch = vBufferSize; // size of all our triangle vertex data
vertexData.SlicePitch = vBufferSize; // also the size of our triangle vertex data

// we are now creating a command with the command list to copy the data from
// the upload heap to the default heap
UpdateSubresources(commandList, vertexBuffer, vBufferUploadHeap, 0, 0, 1,
&vertexData);

// transition the vertex buffer data from copy destination state to vertex buffer state
commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(vertexBuffer, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER));

// Now we execute the command list to upload the initial assets (triangle data)
commandList->Close();
ID3D12CommandList* ppCommandLists[] = { commandList };
commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);

// increment the fence value now, otherwise the buffer might not be uploaded by the time we start drawing
fenceValue[frameIndex]++;
hr = commandQueue->Signal(fence[frameIndex], fenceValue[frameIndex]);
if (FAILED(hr))
{
    Running = false;
}

// create a vertex buffer view for the triangle. We get the GPU memory address to the vertex pointer using the GetGPUVirtualAddress() method
vertexBufferView.BufferLocation = vertexBuffer->GetGPUVirtualAddress();
vertexBufferView.StrideInBytes = sizeof(Vertex);
vertexBufferView.SizeInBytes = vBufferSize;

####Fill out a Viewport and Scissor Rect####
We need to specify a viewport and a scissor rect. The viewport we define will cover our entire render target. Commonly depth is between 0.0 and 1.0 in screen space. The viewport maps the scene from normalized device coordinates to screen space.

After that we create a scissor rect. The scissor rect is defined in screen space. Anything outside the scissor rect will not even make it to the pixel shader.

// Fill out the Viewport
viewport.TopLeftX = 0;
viewport.TopLeftY = 0;
viewport.Width = Width;
viewport.Height = Height;
viewport.MinDepth = 0.0f;
viewport.MaxDepth = 1.0f;

// Fill out a scissor rect
scissorRect.left = 0;
scissorRect.top = 0;
scissorRect.right = Width;
scissorRect.bottom = Height;

####Draw!####
We finally get to the best part, drawing our scene!

The first thing we do is set the root signature.
The root signature set in the command list MUST be the same root signature as the one used when creating the PSO that is set by the time a draw call is created.

After we set the root signature, we set the viewports and scissor rects.

We have to specify the primitive topology we want the Input Assembler to assemble the vertices into, which is a triangle list in this tutorial. That means that every 3 vertices is one triangle (when we get to indexing, this means that every 3 indices is a triangle).

Now we set the vertex buffer we want to draw. We bind the vertex buffer to the IA by calling IASetVertexBuffers() on the command list, and providing the start slot, number of views, and an array of vertex buffer views. We talked briefly about slots when creating the input layout above. We are only using one slot, so we set the first parameter to 0 (the first slot), and the second parameter to 1 (only one view).

Finally we call DrawInstanced() to create a draw command. The first parameter is the number of vertices to draw, the second parameter is the number of instances to draw, the third parameter is the start index of the first vertex to draw, and the last parameter is a value added to each index before reading per-instance data from a vertex buffer (more on this in a later tutorial).
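The "every 3 vertices is one triangle" rule can be written out as a few lines of plain C++. This is only a sketch of the arithmetic, not how the hardware is implemented, and `TrianglesInList` is a hypothetical helper name. Its two parameters play the role of the first and third DrawInstanced() parameters (vertex count and start vertex).

```cpp
#include <array>
#include <vector>

// Hypothetical helper: for a triangle list, returns the vertex indices
// that make up each triangle. Leftover vertices that do not complete a
// triangle are ignored.
std::vector<std::array<unsigned, 3>> TrianglesInList(unsigned vertexCount, unsigned startVertex)
{
    std::vector<std::array<unsigned, 3>> triangles;

    // step through the vertices three at a time
    for (unsigned i = 0; i + 2 < vertexCount; i += 3)
    {
        std::array<unsigned, 3> tri = { startVertex + i, startVertex + i + 1, startVertex + i + 2 };
        triangles.push_back(tri);
    }
    return triangles;
}
```

For this tutorial's draw call (3 vertices starting at vertex 0), the helper yields a single triangle made of vertices 0, 1 and 2.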
// draw triangle
commandList->SetGraphicsRootSignature(rootSignature); // set the root signature
commandList->RSSetViewports(1, &viewport); // set the viewports
commandList->RSSetScissorRects(1, &scissorRect); // set the scissor rects
commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); // set the primitive topology
commandList->IASetVertexBuffers(0, 1, &vertexBufferView); // set the vertex buffer (using the vertex buffer view)
commandList->DrawInstanced(3, 1, 0, 0); // finally draw 3 vertices (draw the triangle)

####Clean Up####
Don't forget to clean up~

SAFE_RELEASE(pipelineStateObject);
SAFE_RELEASE(rootSignature);
SAFE_RELEASE(vertexBuffer);

####Debugging Shaders####
Visual Studio comes with a valuable tool called the graphics debugger. You debug the graphics pipeline with this tool by going to "Debug->Graphics->Start Diagnostics". Your app will begin to run and you will see some text at the top of your app's window with some stats.

+[http://www.braynzarsoft.net/image/100221][Graphics Debugger Stats]

Press the print screen key (PrtScn) on your keyboard, or click "Capture Frame" in Visual Studio. You will see a frame get captured in Visual Studio.

+[http://www.braynzarsoft.net/image/100222][Capture Frame]

Now PAUSE your application in Visual Studio. It will bring you to the line of code you paused on. Go back to the graphics debugger tab by clicking on the file in your tab well called something like "Report<number>.diagsession". DOUBLE CLICK on the frame you just captured. A new window will open with the graphics debugger. In this window, you can see every command that was called that frame on the left side.

+[http://www.braynzarsoft.net/image/100223][Graphics Debugger]

Expand "ExecuteCommandList" to see all the commands that were executed on the GPU with that execute call.
+[http://www.braynzarsoft.net/image/100224][Graphics Event List]

Click on the DrawInstanced item and you will be shown results from the Input Assembler, Vertex Shader, Pixel Shader, and finally the output merger on the right side. You can debug the shaders here by clicking the play icon below the output window.

+[http://www.braynzarsoft.net/image/100225][Shader Debugging]

Looking back at the Graphics Event List on the left side of the screen, you will see blue text on some of the lines that say something like "obj:11". On the DrawInstanced line, click the obj:<number> text. On the right side you will see the state of the pipeline. Here you can check to make sure your state is correct for each of the pipeline stages if there is a problem.

+[http://www.braynzarsoft.net/image/100226][Pipeline State Debugging]

On the input assembler tab you will see Vertex Buffers for example. You can click on the name of the heap you created to see what that heap contains.

+[http://www.braynzarsoft.net/image/100227][Resource Heap Debugging]

The graphics debugger is a very useful tool, and I suggest you learn more about it.

####Source Code####
##VertexShader.hlsl##
float4 main(float3 pos : POSITION) : SV_POSITION
{
    // just pass vertex position straight through
    return float4(pos, 1.0f);
}

##PixelShader.hlsl##
float4 main() : SV_TARGET
{
    // return green
    return float4(0.0f, 1.0f, 0.0f, 1.0f);
}

##stdafx.h##
#pragma once

#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN // Exclude rarely-used stuff from Windows headers.
#endif

#include <windows.h>
#include <d3d12.h>
#include <dxgi1_4.h>
#include <D3Dcompiler.h>
#include <DirectXMath.h>
#include "d3dx12.h"
#include <string>

// this will only call release if an object exists (prevents exceptions calling release on nonexistent objects)
#define SAFE_RELEASE(p) { if ( (p) ) { (p)->Release(); (p) = 0; } }

// Handle to the window
HWND hwnd = NULL;

// name of the window (not the title)
LPCTSTR WindowName = L"BzTutsApp";

// title of the window
LPCTSTR WindowTitle = L"Bz Window";

// width and height of the window
int Width = 800;
int Height = 600;

// is window full screen?
bool FullScreen = false;

// we will exit the program when this becomes false
bool Running = true;

// create a window
bool InitializeWindow(HINSTANCE hInstance, int ShowWnd, bool fullscreen);

// main application loop
void mainloop();

// callback function for windows messages
LRESULT CALLBACK WndProc(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam);

// direct3d stuff
const int frameBufferCount = 3; // number of buffers we want, 2 for double buffering, 3 for triple buffering

ID3D12Device* device; // direct3d device

IDXGISwapChain3* swapChain; // swapchain used to switch between render targets

ID3D12CommandQueue* commandQueue; // container for command lists

ID3D12DescriptorHeap* rtvDescriptorHeap; // a descriptor heap to hold resources like the render targets

ID3D12Resource* renderTargets[frameBufferCount]; // number of render targets equal to buffer count

ID3D12CommandAllocator* commandAllocator[frameBufferCount]; // we want enough allocators for each buffer * number of threads (we only have one thread)

ID3D12GraphicsCommandList* commandList; // a command list we can record commands into, then execute them to render the frame

ID3D12Fence* fence[frameBufferCount]; // an object that is locked while our command list is being executed by the gpu. We need as many
                                      // as we have allocators (more if we want to know when the gpu is finished with an asset)

HANDLE fenceEvent; // a handle to an event when our fence is unlocked by the gpu

UINT64 fenceValue[frameBufferCount]; // this value is incremented each frame. each fence will have its own value

int frameIndex; // current rtv we are on

int rtvDescriptorSize; // size of the rtv descriptor on the device (all front and back buffers will be the same size)

// function declarations
bool InitD3D(); // initializes direct3d 12

void Update(); // update the game logic

void UpdatePipeline(); // update the direct3d pipeline (update command lists)

void Render(); // execute the command list

void Cleanup(); // release com objects and clean up memory

void WaitForPreviousFrame(); // wait until gpu is finished with command list

ID3D12PipelineState* pipelineStateObject; // pso containing a pipeline state

ID3D12RootSignature* rootSignature; // root signature defines data shaders will access

D3D12_VIEWPORT viewport; // area that output from rasterizer will be stretched to.

D3D12_RECT scissorRect; // the area to draw in. pixels outside that area will not be drawn onto

ID3D12Resource* vertexBuffer; // a default buffer in GPU memory that we will load vertex data for our triangle into

D3D12_VERTEX_BUFFER_VIEW vertexBufferView; // a structure containing a pointer to the vertex data in gpu memory
                                           // the total size of the buffer, and the size of each element (vertex)
##main.cpp##
#include "stdafx.h"

using namespace DirectX; // we will be using the directxmath library

struct Vertex {
    XMFLOAT3 pos;
};

int WINAPI WinMain(HINSTANCE hInstance, //Main windows function
    HINSTANCE hPrevInstance,
    LPSTR lpCmdLine,
    int nShowCmd)
{
    // create the window
    if (!InitializeWindow(hInstance, nShowCmd, FullScreen))
    {
        MessageBox(0, L"Window Initialization - Failed",
            L"Error", MB_OK);
        return 1;
    }

    // initialize direct3d
    if (!InitD3D())
    {
        MessageBox(0, L"Failed to initialize direct3d 12",
            L"Error", MB_OK);
        Cleanup();
        return 1;
    }

    // start the main loop
    mainloop();

    // we want to wait for the gpu to finish executing the command list before we start releasing everything
    WaitForPreviousFrame();

    // close the fence event
    CloseHandle(fenceEvent);

    // clean up everything
    Cleanup();

    return 0;
}

// create and show the window
bool InitializeWindow(HINSTANCE hInstance,
    int ShowWnd,
    bool fullscreen)
{
    if (fullscreen)
    {
        HMONITOR hmon = MonitorFromWindow(hwnd,
            MONITOR_DEFAULTTONEAREST);
        MONITORINFO mi = { sizeof(mi) };
        GetMonitorInfo(hmon, &mi);

        Width = mi.rcMonitor.right - mi.rcMonitor.left;
        Height = mi.rcMonitor.bottom - mi.rcMonitor.top;
    }

    WNDCLASSEX wc;

    wc.cbSize = sizeof(WNDCLASSEX);
    wc.style = CS_HREDRAW | CS_VREDRAW;
    wc.lpfnWndProc = WndProc;
    wc.cbClsExtra = NULL;
    wc.cbWndExtra = NULL;
    wc.hInstance = hInstance;
    wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
    wc.hCursor = LoadCursor(NULL, IDC_ARROW);
    wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 2);
    wc.lpszMenuName = NULL;
    wc.lpszClassName = WindowName;
    wc.hIconSm = LoadIcon(NULL, IDI_APPLICATION);

    if (!RegisterClassEx(&wc))
    {
        MessageBox(NULL, L"Error registering class",
            L"Error", MB_OK | MB_ICONERROR);
        return false;
    }

    hwnd = CreateWindowEx(NULL,
        WindowName,
        WindowTitle,
        WS_OVERLAPPEDWINDOW,
        CW_USEDEFAULT, CW_USEDEFAULT,
        Width, Height,
        NULL,
        NULL,
        hInstance,
        NULL);

    if (!hwnd)
    {
        MessageBox(NULL, L"Error creating window",
            L"Error", MB_OK | MB_ICONERROR);
        return false;
    }

    if (fullscreen)
    {
        SetWindowLong(hwnd, GWL_STYLE, 0);
    }

    ShowWindow(hwnd, ShowWnd);
    UpdateWindow(hwnd);

    return true;
}

void mainloop() {
    MSG msg;
    ZeroMemory(&msg, sizeof(MSG));

    while (Running)
    {
        if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
        {
            if (msg.message == WM_QUIT)
                break;

            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
        else {
            // run game code
            Update(); // update the game logic
            Render(); // execute the command queue (rendering the scene is the result of the gpu executing the command lists)
        }
    }
}

LRESULT CALLBACK WndProc(HWND hwnd,
    UINT msg,
    WPARAM wParam,
    LPARAM lParam)
{
    switch (msg)
    {
    case WM_KEYDOWN:
        if (wParam == VK_ESCAPE) {
            if (MessageBox(0, L"Are you sure you want to exit?",
                L"Really?", MB_YESNO | MB_ICONQUESTION) == IDYES)
            {
                Running = false;
                DestroyWindow(hwnd);
            }
        }
        return 0;

    case WM_DESTROY: // x button on top right corner of window was pressed
        Running = false;
        PostQuitMessage(0);
        return 0;
    }
    return DefWindowProc(hwnd,
        msg,
        wParam,
        lParam);
}

bool InitD3D()
{
    HRESULT hr;

    // -- Create the Device -- //

    IDXGIFactory4* dxgiFactory;
    hr = CreateDXGIFactory1(IID_PPV_ARGS(&dxgiFactory));
    if (FAILED(hr))
    {
        return false;
    }

    IDXGIAdapter1* adapter; // adapters are the graphics card (this includes the embedded graphics on the motherboard)

    int adapterIndex = 0; // we'll start looking for directx 12 compatible graphics devices starting at index 0

    bool adapterFound = false; // set this to true when a good one was found

    // find first hardware gpu that supports d3d 12
    while (dxgiFactory->EnumAdapters1(adapterIndex, &adapter) != DXGI_ERROR_NOT_FOUND)
    {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);

        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
        {
            // we dont want a software device
            adapterIndex++; // move on to the next adapter, otherwise this loop would never advance
            continue;
        }

        // we want a device that is compatible with direct3d 12 (feature level 11 or higher)
        hr = D3D12CreateDevice(adapter, D3D_FEATURE_LEVEL_11_0, _uuidof(ID3D12Device), nullptr);
        if (SUCCEEDED(hr))
        {
            adapterFound = true;
            break;
        }

        adapterIndex++;
    }

    if (!adapterFound)
    {
        return false;
    }

    // Create the device
    hr = D3D12CreateDevice(
        adapter,
        D3D_FEATURE_LEVEL_11_0,
        IID_PPV_ARGS(&device)
        );
    if (FAILED(hr))
    {
        return false;
    }

    // -- Create a direct command queue -- //

    D3D12_COMMAND_QUEUE_DESC cqDesc = {};
    cqDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;
    cqDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT; // direct means the gpu can directly execute this command queue

    hr = device->CreateCommandQueue(&cqDesc, IID_PPV_ARGS(&commandQueue)); // create the command queue
    if (FAILED(hr))
    {
        return false;
    }

    // -- Create the Swap Chain (double/triple buffering) -- //

    DXGI_MODE_DESC backBufferDesc = {}; // this is to describe our display mode
    backBufferDesc.Width = Width; // buffer width
    backBufferDesc.Height = Height; // buffer height
    backBufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; // format of the buffer (rgba 32 bits, 8 bits for each channel)

    // describe our multi-sampling. We are not multi-sampling, so we set the count to 1 (we need at least one sample of course)
    DXGI_SAMPLE_DESC sampleDesc = {};
    sampleDesc.Count = 1; // multisample count (no multisampling, so we just put 1, since we still need 1 sample)

    // Describe and create the swap chain.
    DXGI_SWAP_CHAIN_DESC swapChainDesc = {};
    swapChainDesc.BufferCount = frameBufferCount; // number of buffers we have
    swapChainDesc.BufferDesc = backBufferDesc; // our back buffer description
    swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; // this says the pipeline will render to this swap chain
    swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD; // dxgi will discard the buffer (data) after we call present
    swapChainDesc.OutputWindow = hwnd; // handle to our window
    swapChainDesc.SampleDesc = sampleDesc; // our multi-sampling description
    swapChainDesc.Windowed = !FullScreen; // set to true, then if in fullscreen must call SetFullScreenState with true for full screen to get uncapped fps

    IDXGISwapChain* tempSwapChain;

    dxgiFactory->CreateSwapChain(
        commandQueue, // the queue will be flushed once the swap chain is created
        &swapChainDesc, // give it the swap chain description we created above
        &tempSwapChain // store the created swap chain in a temp IDXGISwapChain interface
        );

    swapChain = static_cast<IDXGISwapChain3*>(tempSwapChain);

    frameIndex = swapChain->GetCurrentBackBufferIndex();

    // -- Create the Back Buffers (render target views) Descriptor Heap -- //

    // describe an rtv descriptor heap and create
    D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {};
    rtvHeapDesc.NumDescriptors = frameBufferCount; // number of descriptors for this heap.
    rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV; // this heap is a render target view heap

    // This heap will not be directly referenced by the shaders (not shader visible), as this will store the output from the pipeline
    // otherwise we would set the heap's flag to D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE
    rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
    hr = device->CreateDescriptorHeap(&rtvHeapDesc, IID_PPV_ARGS(&rtvDescriptorHeap));
    if (FAILED(hr))
    {
        return false;
    }

    // get the size of a descriptor in this heap (this is a rtv heap, so only rtv descriptors should be stored in it.
    // descriptor sizes may vary from device to device, which is why there is no set size and we must ask the
    // device to give us the size. we will use this size to increment a descriptor handle offset
    rtvDescriptorSize = device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);

    // get a handle to the first descriptor in the descriptor heap. a handle is basically a pointer,
    // but we cannot literally use it like a c++ pointer.
    CD3DX12_CPU_DESCRIPTOR_HANDLE rtvHandle(rtvDescriptorHeap->GetCPUDescriptorHandleForHeapStart());

    // Create a RTV for each buffer (double buffering is two buffers, triple buffering is 3).
    for (int i = 0; i < frameBufferCount; i++)
    {
        // first we get the n'th buffer in the swap chain and store it in the n'th
        // position of our ID3D12Resource array
        hr = swapChain->GetBuffer(i, IID_PPV_ARGS(&renderTargets[i]));
        if (FAILED(hr))
        {
            return false;
        }

        // then we "create" a render target view which binds the swap chain buffer (ID3D12Resource[n]) to the rtv handle
        device->CreateRenderTargetView(renderTargets[i], nullptr, rtvHandle);

        // we increment the rtv handle by the rtv descriptor size we got above
        rtvHandle.Offset(1, rtvDescriptorSize);
    }

    // -- Create the Command Allocators -- //

    for (int i = 0; i < frameBufferCount; i++)
    {
        hr = device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(&commandAllocator[i]));
        if (FAILED(hr))
        {
            return false;
        }
    }

    // -- Create a Command List -- //

    // create the command list with the first allocator
    hr = device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, commandAllocator[frameIndex], NULL, IID_PPV_ARGS(&commandList));
    if (FAILED(hr))
    {
        return false;
    }

    // -- Create a Fence & Fence Event -- //

    // create the fences
    for (int i = 0; i < frameBufferCount; i++)
    {
        hr = device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence[i]));
        if (FAILED(hr))
        {
            return false;
        }
        fenceValue[i] = 0; // set the initial fence value to 0
    }

    // create a handle to a fence event
    fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    if (fenceEvent == nullptr)
    {
        return false;
    }

    // create root signature

    CD3DX12_ROOT_SIGNATURE_DESC rootSignatureDesc;
    rootSignatureDesc.Init(0, nullptr, 0, nullptr, D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);

    ID3DBlob* signature;
    hr = D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, &signature, nullptr);
    if (FAILED(hr))
    {
        return false;
    }

    hr = device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(&rootSignature));
    if (FAILED(hr))
    {
        return false;
    }

    // create vertex and pixel shaders

    // when debugging, we can compile the shader files at runtime.
    // but for release versions, we can compile the hlsl shaders
    // with fxc.exe to create .cso files, which contain the shader
    // bytecode. We can load the .cso files at runtime to get the
    // shader bytecode, which of course is faster than compiling
    // them at runtime

    // compile vertex shader
    ID3DBlob* vertexShader; // d3d blob for holding vertex shader bytecode
    ID3DBlob* errorBuff; // a buffer holding the error data if any
    hr = D3DCompileFromFile(L"VertexShader.hlsl",
        nullptr,
        nullptr,
        "main",
        "vs_5_0",
        D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION,
        0,
        &vertexShader,
        &errorBuff);
    if (FAILED(hr))
    {
        OutputDebugStringA((char*)errorBuff->GetBufferPointer());
        return false;
    }

    // fill out a shader bytecode structure, which is basically just a pointer
    // to the shader bytecode and the size of the shader bytecode
    D3D12_SHADER_BYTECODE vertexShaderBytecode = {};
    vertexShaderBytecode.BytecodeLength = vertexShader->GetBufferSize();
    vertexShaderBytecode.pShaderBytecode = vertexShader->GetBufferPointer();

    // compile pixel shader
    ID3DBlob* pixelShader;
    hr = D3DCompileFromFile(L"PixelShader.hlsl",
        nullptr,
        nullptr,
        "main",
        "ps_5_0",
        D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION,
        0,
        &pixelShader,
        &errorBuff);
    if (FAILED(hr))
    {
        OutputDebugStringA((char*)errorBuff->GetBufferPointer());
        return false;
    }

    // fill out shader bytecode structure for pixel shader
    D3D12_SHADER_BYTECODE pixelShaderBytecode = {};
    pixelShaderBytecode.BytecodeLength = pixelShader->GetBufferSize();
    pixelShaderBytecode.pShaderBytecode = pixelShader->GetBufferPointer();

    // create input layout

    // The input layout is used by the Input Assembler so that it knows
    // how to read the vertex data bound to it.

    D3D12_INPUT_ELEMENT_DESC inputLayout[] =
    {
        { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 }
    };

    // fill out an input layout description structure
    D3D12_INPUT_LAYOUT_DESC inputLayoutDesc = {};

    // we can get the number of elements in an array by "sizeof(array) / sizeof(arrayElementType)"
    inputLayoutDesc.NumElements = sizeof(inputLayout) / sizeof(D3D12_INPUT_ELEMENT_DESC);
    inputLayoutDesc.pInputElementDescs = inputLayout;

    // create a pipeline state object (PSO)

    // In a real application, you will have many pso's. for each different shader
    // or different combinations of shaders, different blend states or different rasterizer states,
    // different topology types (point, line, triangle, patch), or a different number
    // of render targets you will need a pso

    // VS is the only required shader for a pso. You might be wondering when a case would be where
    // you only set the VS. It's possible that you have a pso that only outputs data with the stream
    // output, and not on a render target, which means you would not need anything after the stream
    // output.
    D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; // a structure to define a pso
    psoDesc.InputLayout = inputLayoutDesc; // the structure describing our input layout
    psoDesc.pRootSignature = rootSignature; // the root signature that describes the input data this pso needs
    psoDesc.VS = vertexShaderBytecode; // structure describing where to find the vertex shader bytecode and how large it is
    psoDesc.PS = pixelShaderBytecode; // same as VS but for pixel shader
    psoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; // type of topology we are drawing
    psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM; // format of the render target
    psoDesc.SampleDesc = sampleDesc; // must be the same sample description as the swapchain and depth/stencil buffer
    psoDesc.SampleMask = 0xffffffff; // sample mask has to do with multi-sampling. 0xffffffff enables all samples
    psoDesc.RasterizerState = CD3DX12_RASTERIZER_DESC(D3D12_DEFAULT); // a default rasterizer state.
    psoDesc.BlendState = CD3DX12_BLEND_DESC(D3D12_DEFAULT); // a default blend state.
    psoDesc.NumRenderTargets = 1; // we are only binding one render target

    // create the pso
    hr = device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(&pipelineStateObject));
    if (FAILED(hr))
    {
        return false;
    }

    // Create vertex buffer

    // a triangle
    Vertex vList[] = {
        { { 0.0f, 0.5f, 0.5f } },
        { { 0.5f, -0.5f, 0.5f } },
        { { -0.5f, -0.5f, 0.5f } },
    };

    int vBufferSize = sizeof(vList);

    // create default heap
    // default heap is memory on the GPU. Only the GPU has access to this memory
    // To get data into this heap, we will have to upload the data using
    // an upload heap
    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT), // a default heap
        D3D12_HEAP_FLAG_NONE, // no flags
        &CD3DX12_RESOURCE_DESC::Buffer(vBufferSize), // resource description for a buffer
        D3D12_RESOURCE_STATE_COPY_DEST, // we will start this heap in the copy destination state since we will copy data
                                        // from the upload heap to this heap
        nullptr, // optimized clear value must be null for this type of resource. used for render targets and depth/stencil buffers
        IID_PPV_ARGS(&vertexBuffer));

    // we can give resource heaps a name so when we debug with the graphics debugger we know what resource we are looking at
    vertexBuffer->SetName(L"Vertex Buffer Resource Heap");

    // create upload heap
    // upload heaps are used to upload data to the GPU. CPU can write to it, GPU can read from it
    // We will upload the vertex buffer using this heap to the default heap
    ID3D12Resource* vBufferUploadHeap;
    device->CreateCommittedResource(
        &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD), // upload heap
        D3D12_HEAP_FLAG_NONE, // no flags
        &CD3DX12_RESOURCE_DESC::Buffer(vBufferSize), // resource description for a buffer
        D3D12_RESOURCE_STATE_GENERIC_READ, // GPU will read from this buffer and copy its contents to the default heap
        nullptr,
        IID_PPV_ARGS(&vBufferUploadHeap));
    vBufferUploadHeap->SetName(L"Vertex Buffer Upload Resource Heap");

    // store vertex buffer in upload heap
    D3D12_SUBRESOURCE_DATA vertexData = {};
    vertexData.pData = reinterpret_cast<BYTE*>(vList); // pointer to our vertex array
    vertexData.RowPitch = vBufferSize; // size of all our triangle vertex data
    vertexData.SlicePitch = vBufferSize; // also the size of our triangle vertex data

    // we are now creating a command with the command list to copy the data from
    // the upload heap to the default heap
    UpdateSubresources(commandList, vertexBuffer, vBufferUploadHeap, 0, 0, 1, &vertexData);

    // transition the vertex buffer data from copy destination state to vertex buffer state
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(vertexBuffer, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER));

    // Now we execute the command list to upload the initial assets (triangle data)
    commandList->Close();
    ID3D12CommandList* ppCommandLists[] = { commandList };
    commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);

    // increment the fence value now, otherwise the buffer might not be uploaded by the time we start drawing
    fenceValue[frameIndex]++;
    hr = commandQueue->Signal(fence[frameIndex], fenceValue[frameIndex]);
    if (FAILED(hr))
    {
        Running = false;
    }

    // create a vertex buffer view for the triangle. We get the GPU memory address to the vertex pointer using the GetGPUVirtualAddress() method
    vertexBufferView.BufferLocation = vertexBuffer->GetGPUVirtualAddress();
    vertexBufferView.StrideInBytes = sizeof(Vertex);
    vertexBufferView.SizeInBytes = vBufferSize;

    // Fill out the Viewport
    viewport.TopLeftX = 0;
    viewport.TopLeftY = 0;
    viewport.Width = Width;
    viewport.Height = Height;
    viewport.MinDepth = 0.0f;
    viewport.MaxDepth = 1.0f;

    // Fill out a scissor rect
    scissorRect.left = 0;
    scissorRect.top = 0;
    scissorRect.right = Width;
    scissorRect.bottom = Height;

    return true;
}

void Update()
{
    // update app logic, such as moving the camera or figuring out what objects are in view
}

void UpdatePipeline()
{
    HRESULT hr;

    // We have to wait for the gpu to finish with the command allocator before we reset it
    WaitForPreviousFrame();

    // we can only reset an allocator once the gpu is done with it
    // resetting an allocator frees the memory that the command list was stored in
    hr = commandAllocator[frameIndex]->Reset();
    if (FAILED(hr))
    {
        Running = false;
    }

    // reset the command list. by resetting the command list we are putting it into
    // a recording state so we can start recording commands into the command allocator.
    // the command allocator that we reference here may have multiple command lists
    // associated with it, but only one can be recording at any time. Make sure
    // that any other command lists associated to this command allocator are in
    // the closed state (not recording).
    // The second parameter is the initial pipeline state object. The previous tutorial
    // passed NULL here because it only cleared the rtv. Now that we are drawing,
    // we pass the pso we created in InitD3D()
    hr = commandList->Reset(commandAllocator[frameIndex], pipelineStateObject);
    if (FAILED(hr))
    {
        Running = false;
    }

    // here we start recording commands into the commandList (which all the commands will be stored in the commandAllocator)

    // transition the "frameIndex" render target from the present state to the render target state so the command list draws to it starting from here
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(renderTargets[frameIndex], D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));

    // here we again get the handle to our current render target view so we can set it as the render target in the output merger stage of the pipeline
    CD3DX12_CPU_DESCRIPTOR_HANDLE rtvHandle(rtvDescriptorHeap->GetCPUDescriptorHandleForHeapStart(), frameIndex, rtvDescriptorSize);

    // set the render target for the output merger stage (the output of the pipeline)
    commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, nullptr);

    // Clear the render target by using the ClearRenderTargetView command
    const float clearColor[] = { 0.0f, 0.2f, 0.4f, 1.0f };
    commandList->ClearRenderTargetView(rtvHandle, clearColor, 0, nullptr);

    // draw triangle
    commandList->SetGraphicsRootSignature(rootSignature); // set the root signature
    commandList->RSSetViewports(1, &viewport); // set the viewports
    commandList->RSSetScissorRects(1, &scissorRect); // set the scissor rects
    commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); // set the primitive topology
    commandList->IASetVertexBuffers(0, 1, &vertexBufferView); // set the vertex buffer (using the vertex buffer view)
    commandList->DrawInstanced(3, 1, 0, 0); // finally draw 3 vertices (draw the triangle)

    // transition the "frameIndex" render target from the render target state to the present state. If the debug layer is enabled, you will receive a
    // warning if present is called on the render target when it's not in the present state
    commandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(renderTargets[frameIndex], D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_PRESENT));

    hr = commandList->Close();
    if (FAILED(hr))
    {
        Running = false;
    }
}

void Render()
{
    HRESULT hr;

    UpdatePipeline(); // update the pipeline by sending commands to the commandqueue

    // create an array of command lists (only one command list here)
    ID3D12CommandList* ppCommandLists[] = { commandList };

    // execute the array of command lists
    commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);

    // this command goes in at the end of our command queue. we will know when our command queue
    // has finished because the fence value will be set to "fenceValue" from the GPU since the command
    // queue is being executed on the GPU
    hr = commandQueue->Signal(fence[frameIndex], fenceValue[frameIndex]);
    if (FAILED(hr))
    {
        Running = false;
    }

    // present the current backbuffer
    hr = swapChain->Present(0, 0);
    if (FAILED(hr))
    {
        Running = false;
    }
}

void Cleanup()
{
    // wait for the gpu to finish all frames
    for (int i = 0; i < frameBufferCount; ++i)
    {
        frameIndex = i;
        WaitForPreviousFrame();
    }

    // get swapchain out of full screen before exiting
    // (GetFullscreenState returns an HRESULT, so test it with SUCCEEDED and then check the flag itself)
    BOOL fs = false;
    if (SUCCEEDED(swapChain->GetFullscreenState(&fs, NULL)) && fs)
        swapChain->SetFullscreenState(false, NULL);

    SAFE_RELEASE(device);
    SAFE_RELEASE(swapChain);
    SAFE_RELEASE(commandQueue);
    SAFE_RELEASE(rtvDescriptorHeap);
    SAFE_RELEASE(commandList);

    for (int i = 0; i < frameBufferCount; ++i)
    {
        SAFE_RELEASE(renderTargets[i]);
        SAFE_RELEASE(commandAllocator[i]);
        SAFE_RELEASE(fence[i]);
    };

    SAFE_RELEASE(pipelineStateObject);
    SAFE_RELEASE(rootSignature);
    SAFE_RELEASE(vertexBuffer);
}

void WaitForPreviousFrame()
{
    HRESULT hr;

    // swap the current rtv buffer index so we draw on the correct buffer
    frameIndex = swapChain->GetCurrentBackBufferIndex();

    // if the current fence value is still less than "fenceValue", then we know the GPU has not finished executing
    // the command queue since it has not reached the "commandQueue->Signal(fence, fenceValue)" command
    if (fence[frameIndex]->GetCompletedValue() < fenceValue[frameIndex])
    {
        // we have the fence create an event which is signaled once the fence's current value is "fenceValue"
        hr = fence[frameIndex]->SetEventOnCompletion(fenceValue[frameIndex], fenceEvent);
        if (FAILED(hr))
        {
            Running = false;
        }

        // We will wait until the fence has triggered the event that its current value has reached "fenceValue". once its value
        // has reached "fenceValue", we know the command queue has finished executing
        WaitForSingleObject(fenceEvent, INFINITE);
    }

    // increment fenceValue for next frame
    fenceValue[frameIndex]++;
}
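The vertex buffer view that InitD3D() fills out comes down to byte arithmetic over the Vertex struct. The following is a standalone sketch of that arithmetic and not part of the tutorial source: Float3 is a hypothetical stand-in for XMFLOAT3 (assumed here to be three tightly packed 32-bit floats), and BufferLayout and describeTriangle are made-up names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for DirectX::XMFLOAT3 (assumption: three packed 32-bit floats).
struct Float3 { float x, y, z; };

// Same shape as the tutorial's Vertex struct.
struct Vertex { Float3 pos; };

// Illustrative summary of the numbers that end up in the vertex buffer view and the draw call.
struct BufferLayout {
    std::size_t sizeInBytes;   // corresponds to vertexBufferView.SizeInBytes
    std::size_t strideInBytes; // corresponds to vertexBufferView.StrideInBytes
    std::size_t vertexCount;   // corresponds to the first argument of DrawInstanced
};

inline BufferLayout describeTriangle() {
    // The same three positions as vList in InitD3D().
    const Vertex vList[] = {
        { {  0.0f,  0.5f, 0.5f } },
        { {  0.5f, -0.5f, 0.5f } },
        { { -0.5f, -0.5f, 0.5f } },
    };
    (void)vList; // only its size is needed here

    BufferLayout layout;
    layout.sizeInBytes   = sizeof(vList);  // whole array, like vBufferSize
    layout.strideInBytes = sizeof(Vertex); // one element
    layout.vertexCount   = layout.sizeInBytes / layout.strideInBytes;
    return layout;
}
```

On a platform where float is 4 bytes, describeTriangle() gives a 12-byte stride and 36 bytes in total, so the vertex count works out to 3, which is the count passed to DrawInstanced(3, 1, 0, 0).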
A small difference people might not notice when transitioning from the previous tutorial to this one is hr = commandList->Reset(commandAllocator[frameIndex], pipelineStateObject); - the fact that nullptr was changed to the pso pointer.
on May 14 `16
lightxbulb
Another thing to note is that sometimes you may not see the IA and vertex shader stages in the vs graphics analyzer - here's a fix that worked for me - http://stackoverflow.com/questions/30126868/visual-studio-graphics-debugger-omits-working-pixel-shader (scroll to the bottom of the page).
on May 14 `16
lightxbulb
In the cleanup method you first assign frameIndex with i, and then call WaitForPreviousFrame. But then the first thing that happens in WaitForPreviousFrame is overwriting frameIndex.
on Apr 02 `17
GPUretarded
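The observation in the comment above can be reproduced with a small mock. This is a sketch in plain C++, not D3D12 code: MockSwapChain, waitForPreviousFrame, waitForFrame and the two loop functions are hypothetical names, and the mock simply assumes the swap chain keeps reporting back buffer 1 as current. The explicit-index variant is one possible alternative, not the author's code.

```cpp
#include <cassert>
#include <vector>

const int frameBufferCount = 3;

// Hypothetical mock: pretends the swap chain is always on back buffer 1.
struct MockSwapChain {
    int GetCurrentBackBufferIndex() const { return 1; }
};

int frameIndex = 0;
MockSwapChain swapChain;
std::vector<int> waitedOn; // records which fence index each wait actually used

// Mirrors the shape of the tutorial's function: the index is re-read from the swap chain first.
void waitForPreviousFrame() {
    frameIndex = swapChain.GetCurrentBackBufferIndex(); // overwrites whatever the caller assigned
    waitedOn.push_back(frameIndex);
}

// Possible alternative (an assumption for illustration): take the index as a parameter.
void waitForFrame(int index) {
    waitedOn.push_back(index);
}

// Runs a Cleanup()-style loop and returns the indices that were waited on.
std::vector<int> cleanupLoopAsWritten() {
    waitedOn.clear();
    for (int i = 0; i < frameBufferCount; ++i) {
        frameIndex = i; // has no effect on the wait below
        waitForPreviousFrame();
    }
    return waitedOn;
}

// Same loop, but each iteration waits on the index it was given.
std::vector<int> cleanupLoopWithExplicitIndex() {
    waitedOn.clear();
    for (int i = 0; i < frameBufferCount; ++i) {
        waitForFrame(i);
    }
    return waitedOn;
}
```

In the mock, the loop as written waits on index 1 three times, while the explicit-index variant visits 0, 1 and 2.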
For those who are transitioning from previous tutorial note that we had "closed" our commandList at the time of its creation in the prev tut. But here we are recording the commandList with commands for copying vertex data from upload heap to default heap. Therefore we should remove that "commandList->Close()" statement at the time of commandList's creation.
on Dec 12 `17
aman2218
Thank you aman2218, i should probably have made that more clear
on Dec 12 `17
iedoc
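The exchange above comes down to the command list's recording state: a list returned by CreateCommandList starts out open for recording, Close() ends recording, and Reset() opens it again. Below is a minimal mock of that state machine in plain C++. MockCommandList, record and the two helper functions are hypothetical names, and the mock only captures the ordering, not how the real D3D12 interface reports misuse.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mock of a command list's open/closed state; not the D3D12 API.
class MockCommandList {
public:
    // Like a freshly created D3D12 command list, the mock starts in the recording state.
    MockCommandList() : recording_(true) {}

    // Records a command name; returns false if the list is closed.
    bool record(const std::string& command) {
        if (!recording_) return false;
        commands_.push_back(command);
        return true;
    }

    // Ends recording; returns false if the list was already closed.
    bool Close() {
        if (!recording_) return false;
        recording_ = false;
        return true;
    }

    // Reopens the list for recording and, in this mock, discards earlier commands.
    // Returns false if the list is still recording (it has to be closed first).
    bool Reset() {
        if (recording_) return false;
        commands_.clear();
        recording_ = true;
        return true;
    }

    std::size_t commandCount() const { return commands_.size(); }

private:
    bool recording_;
    std::vector<std::string> commands_;
};

// Previous tutorial's order: close right after creation, then try to record the upload.
inline bool uploadRecordedAfterEarlyClose() {
    MockCommandList list;
    list.Close();
    return list.record("copy upload heap -> default heap");
}

// This tutorial's order: record the upload first, then close before executing.
inline bool uploadRecordedBeforeClose() {
    MockCommandList list;
    bool ok = list.record("copy upload heap -> default heap");
    return ok && list.Close();
}
```

With the early Close() left in, the upload copy never makes it into the list, which matches the missing triangle described in the comments.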
failed to initiate direct3d 12
on Dec 19 `17
rekatha
if someone has error
failed to initiate direct3d 12 because copy paste codes above, try to change codes in PixelShader.hlsl with this
float4 main() : SV_TARGET
{
// return green
return float4(0.0f, 1.0f, 0.0f, 1.0f);
}
on Dec 19 `17
rekatha
Thank you rekatha. Could you point out what exactly you changed in the pixel shader? It looks the same to me
on Dec 19 `17
iedoc
Something confusing I found is that in your last tutorial, you ask us to close out the command list once it's created. However, in this tutorial, the command list was never closed after the creation. If I kept the command list closed like in last tutorial the triangle actually won't be drawn on the screen :/
on Mar 13 `22
hummingbird
@iedoc Hello iedoc,
Inside initD3D function, I wonder as to why a fence is required even though you've used a resource barrier for the "default" vertex buffer in GPU to transition it from D3D12_RESOURCE_STATE_COPY_DEST to D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER.
// increment the fence value now, otherwise the buffer might not be uploaded by the time we start drawing
fenceValue[frameIndex]++;
hr = commandQueue->Signal(fence[frameIndex], fenceValue[frameIndex]);
Isn't putting a resource barrier enough to ensure that CPU will stall waiting for the transition to occur and in that case, using a fence is redundant?
Thanks,
--ANURAG.
on Jul 16 `22
cat7skill@gmail.com